compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Error While Importing Data to PeptideShaker #362

Closed jkrebs25 closed 5 years ago

jkrebs25 commented 5 years ago

Hi,

An error occurs when I attempt to import data from SearchGUI into PeptideShaker, and the import fails. The bug report is shown below.

[Image: MGSF_error_peptideshaker]

Mon Jul 01 15:35:08 CDT 2019: PeptideShaker version 1.16.40.
Memory given to the Java virtual machine: 4660396032.
Total amount of memory in the Java virtual machine: 128974848.
Free memory: 107826672.
Java version: 1.8.0_211.
1714 script command tokens
(C) 2009 Jmol Development
Jmol Version: 12.0.43 2011-05-03 14:21
java.vendor: Oracle Corporation
java.version: 1.8.0_211
os.name: Windows 7
memory: 54.2/163.1
processors available: 8
useCommandThread: false
java.lang.NullPointerException
    at com.compomics.util.experiment.biology.Peptide.getPotentialModificationSites(Peptide.java:1067)
    at com.compomics.util.experiment.biology.PTMFactory.getExpectedPTMs(PTMFactory.java:436)
    at eu.isas.peptideshaker.fileimport.PsmImporter.importAssumptions(PsmImporter.java:526)
    at eu.isas.peptideshaker.fileimport.PsmImporter.importPsm(PsmImporter.java:331)
    at eu.isas.peptideshaker.fileimport.PsmImporter.access$000(PsmImporter.java:69)
    at eu.isas.peptideshaker.fileimport.PsmImporter$PsmImporterRunnable.run(PsmImporter.java:1415)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

hbarsnes commented 5 years ago

Hi,

Which PTMs are included in your search? And would it be possible for you to share the data set with us so that we can try to reproduce the issue on our end?

Best regards, Harald

jkrebs25 commented 5 years ago

Hi Harald,

Thanks so much for the quick reply!

The PTMs used are acetylation (K and N-terminus), amidation, phosphorylation (S, T, and Y), half-disulfide bond per cysteine residue, pyroglutamination from E and Q, and Met oxidation.

I have attached two data sets below. One is from a run using OMSSA and the other is a run using MS-GF+.

Best,

Jessi Krebs

Abd_1_0712.msgf.mzid https://drive.google.com/a/illinois.edu/file/d/1Ig2_BbkLiDdgDuo-hJMH47VyFaZ7nX3s/view?usp=drive_web
Abd_1_0712.omx https://drive.google.com/a/illinois.edu/file/d/1jTLmRHu-F3skF22ooTHnUnTTfun8HGUd/view?usp=drive_web

hbarsnes commented 5 years ago

Hi Jessi,

Thanks for the files. However, to load them in PeptideShaker to reproduce the issue we also need the mgf and FASTA files, plus the search parameter file.

Best regards, Harald

jkrebs25 commented 5 years ago

Hi Harald,

I'm having a hard time getting my emails to send. Have you received the files?

Thanks again!

Jessi

Abd_1_0712.mgf https://drive.google.com/a/illinois.edu/file/d/1RuxmfB1NXaldWP4e9S9nZDAu-LXB5Wss/view?usp=drive_web

hbarsnes commented 5 years ago

Hi Jessi,

> I'm having a hard time getting my emails to send. Have you received the files?

I now have the search result files and the mgf file, but I am still missing the FASTA file and the search parameters.

Best regards, Harald

jkrebs25 commented 5 years ago

Hi Harald,

Here are the other two files. I'm sorry for all the confusion.

Thanks again!

Jessi Krebs

hbarsnes commented 5 years ago

Hi Jessi,

I'm afraid that attaching files when replying to an issue via email does not work. I assume that is what you did, given that no links were included this time, unlike for the files you shared earlier?

Best regards, Harald

jkrebs25 commented 5 years ago

Sorry for all the trouble. Do these links work?

Best,

Jessi Krebs

Test Run Settings.zip
ELH.zip

hbarsnes commented 5 years ago

Hi Jessi,

The links work but the FASTA file only contains one sequence?

Best regards, Harald

jkrebs25 commented 5 years ago

Hi Harald,

Yes, the fasta only contains one sequence. The search was originally taking longer than we were able to wait for so we shortened the file to one protein.

Best, Jessi

hbarsnes commented 5 years ago

Hi Jessi,

> Yes, the fasta only contains one sequence. The search was originally taking longer than we were able to wait for so we shortened the file to one protein.

You are aware that having a FASTA file consisting of only one protein, combined with using 10 variable PTMs and an unspecific search, basically forces the search engines to identify peptides from the given protein? This means that you cannot really trust the identifications as you're not giving the search engines any choice but to identify peptides from the given protein.

Is there any particular reason why you need to use an unspecific search btw? Are you not using a specific enzyme to cleave the protein(s) into peptides before inserting the sample into the instrument?

Also, I was looking closer at the "Half-Disulfide Bond per Cysteine Residue" PTM, and I found its target pattern quite odd? It targets K and R (unless followed by P), but the actual modification is only supposed to target cysteines?

Best regards, Harald
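
To illustrate the target-pattern mismatch described in the comment above, here is a minimal Python sketch. It is not how PeptideShaker or SearchGUI store or evaluate PTM target patterns; the two regular expressions, the function name `candidate_sites`, and the example peptide are all hypothetical and only mimic "K or R unless followed by P" versus "cysteine only".

```python
import re

# Hypothetical target patterns written as regular expressions for illustration;
# the real tools keep PTM target patterns in their own internal format.
MISCONFIGURED_PATTERN = re.compile(r"[KR](?!P)")  # K or R, unless followed by P
INTENDED_PATTERN = re.compile(r"C")               # cysteine only


def candidate_sites(sequence, pattern):
    """Return the 1-based positions in `sequence` where `pattern` matches."""
    return [match.start() + 1 for match in pattern.finditer(sequence)]


# Made-up peptide sequence, not taken from the data set in this issue.
peptide = "ACKRPCKD"

# With the misconfigured pattern the modification would be placed on K/R residues.
print(candidate_sites(peptide, MISCONFIGURED_PATTERN))

# With the intended pattern only the cysteines are candidate sites.
print(candidate_sites(peptide, INTENDED_PATTERN))
```

With the misconfigured pattern, a cysteine modification would never be offered a cysteine as a site, and a peptide without K or R would have no candidate site at all, which is the kind of inconsistency the comment above points out.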

hbarsnes commented 5 years ago

Hi again,

One more comment on the PTMs selected. There is no point in including both amidated peptide C-term and amidated protein C-term, as the first one also covers the second one. :)

Best regards, Harald

jkrebs25 commented 5 years ago

Hi Harald,

Thank you so much for all of the help!

Yes, we are aware of the situation with the fasta. The results from this run will not be used for any important analysis. It is just to learn the software before running the search with the full fasta (as last time this took many hours and mistakes were made, and therefore the data wasn't usable). So I am just using this to make sure I can use the software correctly first.

In regard to the enzyme, no we did not use an enzyme to cleave the proteins.

And thank you for the comments on the PTMs! Those were mistakes that I did not notice before running the search.

Best,

Jessi Krebs

hbarsnes commented 5 years ago

Hi Jessi,

I'd also be careful with using as many variable PTMs as you do, as this basically offers the search engines too many chances to make mistakes, given that there are so many possible combinations of PTMs that may result in a high-scoring peptide-spectrum match simply by chance.

However, if I correct the "Half-Disulfide Bond per Cysteine Residue" PTM to only target cysteines (and remove the amidated protein C-term PTM) and re-search the data in SearchGUI, I'm then able to load the results in PeptideShaker without any issues. Can you confirm that this is the case on your end as well?

Best regards, Harald
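
The point about search space in the comment above can be made concrete with a rough back-of-the-envelope sketch. It does not reproduce how any search engine enumerates candidates. The function names, the protein length, the peptide length limits, and the site counts are all assumptions chosen for illustration, and the count considers only a single modification type.

```python
from math import comb


def unspecific_peptide_count(protein_length, min_len, max_len):
    """Count contiguous subsequences whose length lies in [min_len, max_len].

    This is the number of candidate peptides from one protein when no
    cleavage enzyme restricts where a peptide may start or end.
    """
    longest = min(max_len, protein_length)
    return sum(protein_length - length + 1 for length in range(min_len, longest + 1))


def modified_forms(n_sites, max_mods):
    """Count ways to place at most `max_mods` copies of ONE variable
    modification on `n_sites` candidate residues (unmodified form included)."""
    return sum(comb(n_sites, k) for k in range(min(max_mods, n_sites) + 1))


# Assumed example values, not taken from the data set in this issue.
peptides = unspecific_peptide_count(protein_length=300, min_len=8, max_len=30)
forms_per_peptide = modified_forms(n_sites=6, max_mods=3)

print("candidate peptides:", peptides)
print("modified forms per peptide (one PTM type):", forms_per_peptide)
```

Each additional variable PTM multiplies the number of forms per peptide again, so with many variable PTMs the number of candidates per spectrum grows quickly, and so does the chance of a high-scoring match arising by chance.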

jkrebs25 commented 5 years ago

Hi Harald,

Yes, I have been able to load the results into PeptideShaker. Thank you so much for all of the help!

Best, Jessi