compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Search parameter files created via the GUI and the CLI are not identical #165

Open Thys3Potgieter opened 8 years ago

Thys3Potgieter commented 8 years ago

Hi guys, I hope you can help me. I am new to PeptideShaker and am using the Bioinformatics for Proteomics tutorial (April 2016) and the resources you provided there to develop and validate a command-line pipeline for use on a Linux cluster. I am getting very different results when I analyze the mgf file with parameters generated with SearchGUI (as per the tutorial) than when using the .par provided in the resources of Tutorial 1.4. I can't figure out why this is, as I assume they should be the same.

When using the data provided in Tutorial 1.3 but the .par I produced using SearchGUI, I get no MS2 Quant values, and the percentage coverage I get for the protein Q15149 (0.26% and only one PSM) is very different from that described in the tutorial. I believe the only difference is the parameter file, as I can produce the expected results when I use the .par file provided in the resources. I tried using the IdentificationParametersCLI to produce the correct .par file as well, but am experiencing the same thing. When I load the different .par files, they all show the same settings in the GUI but seem to give very different results.

I attach the .par file from Tutorial 1.4 (Analysed using Mac, java build 1.8.0_65-b17): Tutorial-April-2016.par.txt

Parameters produced using SearchGUI as per the tutorial (using Mac, java build 1.8.0_65-b17): SearchGUI.par.txt screenshot 2016-05-02 15 50 58 screenshot 2016-05-02 15 51 32

Parameters produced using PeptideShaker IdentificationParametersCLI on Linux server, jdk1.8.0_31: identification.par.txt

I am using SearchGUI-2.8.4 and PeptideShaker-1.10.1. (Please note the .txt after the .par files is simply so GitHub will let me paste them here.) The original files are too big to upload easily, but if needed I will upload them using Google Drive.

I would really appreciate feedback on this, as I hope to use these tools heavily for future work, especially the command-line interface.

My current workaround on the server is to load the working .par file from the tutorial and modify it with the IdentificationParametersCLI on the server, but I have had no luck with just SearchGUI or the CLI alone.

Thanks!

Thys3Potgieter commented 8 years ago

On further thought, this may be due to differences in the default advanced settings, but I could not find any yet on cursory inspection, and the difference in results using the different .par files is big. Thys

hbarsnes commented 8 years ago

Hi Thys,

Thanks for telling us about this. If I diff the two files (they are just text files in the JSON format), I see a difference in the maxMassDeviation as part of the peptideAssumptionFilter. This bug should have been fixed in the latest versions of SearchGUI (v2.8.5) and PeptideShaker (v1.10.2).
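For readers who want to repeat this kind of comparison, here is a minimal Python sketch of a key-by-key diff of two JSON parameter files. It relies only on the statement above that the .par files are plain JSON text. The function names, the exact nesting of `maxMassDeviation` under `peptideAssumptionFilter`, and all values in the demo are assumptions made for illustration, not taken from the attached files.

```python
import json


class _Missing:
    """Marker for a key that exists in only one of the two files."""

    def __repr__(self):
        return "<missing>"


MISSING = _Missing()


def flatten(node, prefix=""):
    """Map every leaf of nested JSON data to a dotted path such as 'a.b[0].c'."""
    flat = {}
    if isinstance(node, dict) and node:
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(node, list) and node:
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        # Scalars, and empty containers, are recorded as leaves.
        flat[prefix] = node
    return flat


def diff_parameters(left, right):
    """Return {path: (left_value, right_value)} for every leaf that differs."""
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(a.keys() | b.keys())
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


def diff_parameter_files(left_path, right_path):
    """Load two JSON parameter files and diff them leaf by leaf."""
    with open(left_path, encoding="utf-8") as lf, open(right_path, encoding="utf-8") as rf:
        return diff_parameters(json.load(lf), json.load(rf))


if __name__ == "__main__":
    # Made-up values; only the two key names come from the discussion above.
    gui = {"peptideAssumptionFilter": {"maxMassDeviation": 10.0, "otherSetting": 1}}
    cli = {"peptideAssumptionFilter": {"maxMassDeviation": -1.0, "otherSetting": 1}}
    for path, (left_value, right_value) in diff_parameters(gui, cli).items():
        print(f"{path}: {left_value!r} != {right_value!r}")
```

Comparing flattened paths rather than raw text lines keeps the report independent of key order and whitespace, which can differ between files written by different tools.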

Would be great if you could retest with these versions to confirm that the problem has been fixed on your side as well?

Best regards, Harald

Thys3Potgieter commented 8 years ago

Hi Harald,

Thanks for the quick response!

I have tested it with the IdentificationParametersCLI using the new versions (command-line options only, without using the template .par as before) and it looks good (but not 100% the same as the tutorial):

Using IdentificationParametersCLI:

    java -cp SearchGUI-2.8.5.jar eu.isas.searchgui.cmd.IdentificationParametersCLI \
        -out ${output_folder}"/identification.par" \
        -db ${fasta_file%.fasta}"_concatenated_target_decoy.fasta" \
        -prec_tol '10' \
        -prec_ppm '1' \
        -frag_tol '0.02' \
        -frag_ppm '0' \
        -enzyme 'Trypsin' \
        -fixed_mods "Carbamidomethylation of C" \
        -variable_mods "Oxidation of M" \
        -min_charge '2' \
        -max_charge '4' \
        -mc '2' \
        -fi 'b' \
        -ri 'y'
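For a scripted pipeline, the same invocation could be assembled as an argument list, which avoids shell quoting problems with values such as "Carbamidomethylation of C". The following is a rough sketch only: the function name and its arguments are hypothetical, and the flags and values are copied from the command above rather than checked against the SearchGUI documentation.

```python
import shlex


def build_identification_parameters_command(jar, output_folder, fasta_file):
    """Build the argument list for the IdentificationParametersCLI call shown above."""
    # Mirror the shell expansion ${fasta_file%.fasta}: drop a trailing ".fasta" if present.
    suffix = ".fasta"
    base = fasta_file[: -len(suffix)] if fasta_file.endswith(suffix) else fasta_file

    # Flags and values exactly as used in the command above.
    options = {
        "-out": f"{output_folder}/identification.par",
        "-db": f"{base}_concatenated_target_decoy.fasta",
        "-prec_tol": "10",
        "-prec_ppm": "1",
        "-frag_tol": "0.02",
        "-frag_ppm": "0",
        "-enzyme": "Trypsin",
        "-fixed_mods": "Carbamidomethylation of C",
        "-variable_mods": "Oxidation of M",
        "-min_charge": "2",
        "-max_charge": "4",
        "-mc": "2",
        "-fi": "b",
        "-ri": "y",
    }

    command = ["java", "-cp", jar, "eu.isas.searchgui.cmd.IdentificationParametersCLI"]
    for flag, value in options.items():
        command += [flag, value]
    return command


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_identification_parameters_command("SearchGUI-2.8.5.jar", "results", "database.fasta")
    # Print a shell-quoted version for logging; the list itself is what gets executed.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, so each value reaches Java as a single argument regardless of the spaces it contains.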

screenshot 2016-05-04 12 34 42 identification.par.txt

With SearchGUI generated .par: screenshot 2016-05-04 11 29 37 newsearch.par.txt

With example .par: screenshot 2016-05-04 12 38 37

Tutorial-April-2016.par.txt

The total IDs and confident assignments differ very slightly between the CLI and SearchGUI. Both differ from the results obtained with the .par provided in the tutorial (same basic settings). Do you recommend using the CLI as I did now, without a template, or modifying the template from the tutorial? I am primarily working on the command line.

Thanks! Thys

Thys3Potgieter commented 8 years ago

Just to clarify, the CLI analysis (SearchGUI and PeptideShaker) was run on an HPC Linux cluster with a different version of Java, so the small difference is probably not relevant at all. Maybe the difference between SearchGUI runs using my own generated .par file and the tutorial .par file reflects differences in the advanced settings or updates to the software? But all in all, they are very similar now.

Thanks!

hbarsnes commented 8 years ago

Hi again,

The results when using the GUI or the CLI option to create a parameter file ought to be identical. However, from your example files I've detected some very minor differences in the way we handle a couple of the parameters. We will now look into each of these in detail and make sure that all of the parameters are set up in exactly the same way in both the GUI and the CLI. We'll let you know as soon as a new version is available for testing. Thanks again for pointing out this discrepancy.

Best regards, Harald

Thys3Potgieter commented 8 years ago

Thank you, Harald. Please let me know if I can help with testing at any point.

Kind regards, Thys