compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Peptide shaker fails with any of the new versions. label: help wanted #466

Closed subinamehta closed 3 years ago

subinamehta commented 3 years ago

Hello,

I have been trying to incorporate the new version of SGPS into the GTN workflows, but the new versions of PeptideShaker keep failing. Could you please look into this? https://usegalaxy.eu/u/subina/h/copy-of-error-updated-gtn-workflow---subina

Thanks, Subina

hbarsnes commented 3 years ago

Hi Subina,

We are looking into it and I can confirm that we are able to reproduce the issue with your dataset on our end. I cannot promise when we'll be able to come up with a fix though, as it seems to be related to the database backend and such issues are inherently tricky to debug.

Best regards, Harald

hbarsnes commented 3 years ago

Hi Subina,

The problem seems to be in your FASTA file: the protein accession numbers are identical to the protein/peptide sequences they represent. For example:

>generic|AAAAADVEQEVNRAKEALR
AAAAADVEQEVNRAKEALR

As a result, the database cannot distinguish between a protein and a peptide sequence when the two are the same, such as when the whole protein sequence is detected as a single peptide.

By changing your FASTA file headers slightly, the problem is resolved. For example:

>generic|pAAAAADVEQEVNRAKEALR
AAAAADVEQEVNRAKEALR

The only change is the addition of the lowercase "p" at the start of the accession number.

It is probably easiest to make this change in the FASTA file and then rerun the search.

Best regards, Harald
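
For a large FASTA file, the header change described above could be scripted rather than made by hand. The following is a minimal sketch, not part of PeptideShaker or SearchGUI: the function name `prefix_self_named_accessions` is hypothetical, and it assumes headers follow the `>generic|ACCESSION` layout shown in the example, with the accession as the second `|`-separated field. Only records whose accession is identical to their sequence are changed.

```python
def prefix_self_named_accessions(fasta_text, prefix="p"):
    """Return FASTA text where accessions equal to their own sequence get `prefix`.

    Assumes headers look like '>generic|ACCESSION' (optionally with further
    '|'-separated fields). Other headers are left untouched.
    """
    # Group lines into records: [header line, [sequence lines]]
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records.append([line, []])
        elif records:
            records[-1][1].append(line)

    out = []
    for header, seq_lines in records:
        # Join wrapped sequence lines so the comparison uses the full sequence
        sequence = "".join(s.strip() for s in seq_lines)
        fields = header[1:].split("|")
        # Rename only when the accession is identical to the sequence
        if len(fields) >= 2 and fields[1].strip() == sequence:
            fields[1] = prefix + fields[1].strip()
        out.append(">" + "|".join(fields))
        out.extend(seq_lines)
    return "\n".join(out) + "\n"
```

To use it, read the original FASTA file into a string, pass it to the function, and write the result to a new file, then rerun the search against the new file. Keeping the original file unchanged makes it easy to compare the headers before and after.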