Nesvilab / MSFragger

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics
https://msfragger.nesvilab.org

ValueError: Specified a sep and a delimiter; you can only specify one. #167

Closed shahbazymoh closed 3 years ago

shahbazymoh commented 3 years ago

Dear software tool developer,

I have run the FragPipe and encountered the following error:


Traceback (most recent call last):
  File "C:\Users\msha0053\fragpipe\tools\msfragger_pep_split.py", line 477, in <module>
    main()
  File "C:\Users\msha0053\fragpipe\tools\msfragger_pep_split.py", line 468, in main
    write_combined_scores_histo()
  File "C:\Users\msha0053\fragpipe\tools\msfragger_pep_split.py", line 142, in write_combined_scores_histo
    scores_histos = [sum(pd.read_csv(ee / (e.stem + '_scores_histogram.tsv'), dtype=np.uint64, delimiter='\t', header=None, sep='\t').values for ee in tempdir_parts)
  File "C:\Users\msha0053\fragpipe\tools\msfragger_pep_split.py", line 142, in <listcomp>
    scores_histos = [sum(pd.read_csv(ee / (e.stem + '_scores_histogram.tsv'), dtype=np.uint64, delimiter='\t', header=None, sep='\t').values for ee in tempdir_parts)
  File "C:\Users\msha0053\fragpipe\tools\msfragger_pep_split.py", line 142, in <genexpr>
    scores_histos = [sum(pd.read_csv(ee / (e.stem + '_scores_histogram.tsv'), dtype=np.uint64, delimiter='\t', header=None, sep='\t').values for ee in tempdir_parts)
  File "C:\Users\msha0053\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\msha0053\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers\readers.py", line 571, in read_csv
    kwds_defaults = _refine_defaults_read(
  File "C:\Users\msha0053\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers\readers.py", line 1303, in _refine_defaults_read
    raise ValueError("Specified a sep and a delimiter; you can only specify one.")
ValueError: Specified a sep and a delimiter; you can only specify one.
DONE: slice 18 of 18
Process 'MSFragger' finished, exit code: 1
Process returned non-zero exit code, stopping

Cancelling 125 remaining tasks


Could you please help me resolve this? The pipeline completes the search on all assigned database slices in the DDA database search (used to generate a spectral library), but the error appears after the last slice. I have attached a screenshot of the validation tab just in case.

image

I really appreciate any help you can provide. Moh

guoci commented 3 years ago

Please try the following fix: https://drive.google.com/file/d/1joOIbUAfB5OV2FOACAql5sgJgk-X_Wqd/view?usp=sharing
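
For readers who hit the same traceback: the contents of the linked fix are not shown in this thread, but the error message itself points at the cause. The call on line 142 passes both `sep='\t'` and `delimiter='\t'` to `pd.read_csv`, and the pandas version in the traceback rejects that combination. A minimal sketch of the situation, assuming such a pandas version, with made-up histogram data and hypothetical helper names (`read_histogram`, `conflicting_call_raises`) that are not part of FragPipe:

```python
import io

import numpy as np
import pandas as pd

# Two tiny stand-ins for per-slice "*_scores_histogram.tsv" files (made-up data).
part_a = "1\t2\t3\n4\t5\t6\n"
part_b = "10\t20\t30\n40\t50\t60\n"


def read_histogram(text):
    """Read one tab-separated histogram, passing only `sep` (not `delimiter` as well)."""
    return pd.read_csv(io.StringIO(text), dtype=np.uint64, header=None, sep="\t").values


def conflicting_call_raises(text):
    """Return True if passing both `sep` and `delimiter` raises ValueError."""
    try:
        pd.read_csv(io.StringIO(text), header=None, sep="\t", delimiter="\t")
    except ValueError:
        return True
    return False


# Element-wise sum across slices, mirroring the sum(...) seen in the traceback.
combined = sum(read_histogram(t) for t in (part_a, part_b))
print(combined)
print("conflicting call raises:", conflicting_call_raises(part_a))
```

Whether the conflicting call raises depends on the installed pandas version, which is why the sketch reports it rather than assuming it. Passing a single separator argument works in either case.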

shahbazymoh commented 3 years ago

Thanks, @guoci, for your reply. I tested what you sent and the error has been resolved. But at the end of the run, I can't see any lib.tsv file! Only pep.xml files are there. Am I on the right track? I used EasyPQP to generate the spectral library. image

guoci commented 3 years ago

@shahbazymoh you should see a library.tsv file generated. If not, please post your log and let me investigate.

shahbazymoh commented 3 years ago

Please find the attached log. Many thanks in advance @guoci

log_20210811_shahbazymoh.txt

guoci commented 3 years ago

@shahbazymoh can you send me the following files?
E:\Data\SpecLib_C1RB57_MSFragger_10082021\LSIP_1\interact-F120190510_PF_C1R_B5701_Library_W632_MS_pool1.pep.xml
\\ad.monash.edu\shared\R-MNHS-SOBS-BIOCHEM\PurcellMS\Mohammad\DIAbench_project_data\DDA data\F120191219_PF_C1R-B5701_w632_MS_12122019_S3_DDA_uncalibrated.mgf
F120191219_PF_C1R-B5701_w632_MS_12122019_S3_DDA.psmpkl
F120191219_PF_C1R-B5701_w632_MS_12122019_S3_DDA.peakpkl
E:\Data\SpecLib_C1RB57_MSFragger_10082021\LSIP_1\psm.tsv
E:\Data\SpecLib_C1RB57_MSFragger_10082021\LSIP_1\peptide.tsv
E:\Data\SpecLib_C1RB57_MSFragger_10082021\LSIP_1\irt.tsv
E:\Data\SpecLib_C1RB57_MSFragger_10082021\LSIP_1\filelist_easypqp_library.txt

shahbazymoh commented 3 years ago

Hi @guoci , all requested files have been zipped and attached. Thanks

shahbazymoh_requested files_12082021.zip

guoci commented 3 years ago

Hi @shahbazymoh , please try the following fix. https://drive.google.com/file/d/1iQWSC3uImaCKus72SEnmSg2fpOxDZ15g/view?usp=sharing

shahbazymoh commented 3 years ago

Many thanks, @guoci. I ran what you sent but hit a different error this time:

image

The log is attached as well. Please have a look. Best, Moh

log_2021-08-13_shahbazymoh.txt

guoci commented 3 years ago

Can you post the file E:\Data\SpecLib_C1RB57_MSFragger_12082021\LSIP_1\filelist_easypqp_library.txt?

shahbazymoh commented 3 years ago

Sure, please find the attached file. Thanks

filelist_easypqp_library.txt

guoci commented 3 years ago

Hi @shahbazymoh , please try the following fix. https://drive.google.com/file/d/12GbLIm9zjH8H8U1-jPjDU8ySrr_9usOE/view?usp=sharing

shahbazymoh commented 3 years ago

Hi @guoci, thanks for the fixed package. I tested it and ran into another error. I have attached the log and screenshots of two settings sections that I suspect may be involved. Please have a look and let me know what you think. Many thanks

spectral library setting: image

output setting: image

recorded log: log_2021-08-15_22-55-04.txt

guoci commented 3 years ago

From the log, I think you should see a library at E:\Data\SpecLib_C1RB57_MSFragger_15082021\LSIP_1\library.tsv. Can you send me all files in the folder E:\Data\SpecLib_C1RB57_MSFragger_15082021\LSIP_2, as well as \\ad.monash.edu\shared\R-MNHS-SOBS-BIOCHEM\PurcellMS\Mohammad\DIAbench_project_data\DDA data\F120190510_PF_C1R_B5701_Library_W632_MS_pool2_uncalibrated.mgf?

shahbazymoh commented 3 years ago

Hi @guoci, yes, but that library covers only the first fraction. I need a combined library from all fractions. Please find the attached zip file. I could not include "F120190510_PF_C1R_B5701_Library_W632_MS_pool2_uncalibrated.mgf" because GitHub does not accept this format, although I sent a file in the same format before; I am not sure why. If you give me an email address, I will send it as well.

LSIP_2.zip

guoci commented 3 years ago

The library for LSIP_2 cannot be generated because not enough peptides could be found for alignment. Note that you are using ciRT for alignment.

sarah-haynes commented 3 years ago

Hi Moh, we noticed the precursor tolerance is set to -/+ 10 ppm, which is a little narrower than we typically recommend. Could you try -/+ 20 ppm and see if that helps get you a few more PSMs?

Sarah
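
As a rough worked example of what the two settings mean in absolute terms: a ppm tolerance scales with the precursor m/z, so the window half-width is m/z × ppm / 10^6. The sketch below assumes an arbitrary precursor at m/z 500, and `ppm_to_mz` is a hypothetical helper, not a FragPipe or MSFragger function.

```python
def ppm_to_mz(mz: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute half-width in m/z units."""
    return mz * ppm / 1e6


mz = 500.0  # arbitrary example precursor m/z (assumption for illustration)
for ppm in (10, 20):
    half_width = ppm_to_mz(mz, ppm)
    # Print the lower and upper bounds of the tolerance window.
    print(f"-/+ {ppm} ppm at m/z {mz}: {mz - half_width:.4f} to {mz + half_width:.4f}")
```

Doubling the tolerance doubles the window width at any given m/z.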

fcyu commented 3 years ago

There are thousands of PSMs identified in each run, and the mass calibration table shows that the precursor mass precision is quite high (MAD = 0.6), so I don't think narrow tolerance is the reason.

They are using a non-specific search, but the ciRT peptides are tryptic, which makes the number of overlapping peptides low. @shahbazymoh can you try with the "automatic selection of a run as reference RT"?

Best,

Fengchao
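
The overlap argument above can be illustrated with a small sketch: retention time alignment needs peptides that appear both in the identifications and in the reference set, and a tryptic reference shares few sequences with a non-specific search. All sequences below are made-up placeholders, and both `shared_with_reference` and the `MIN_ANCHORS` threshold are assumptions for illustration; they are not EasyPQP names or values.

```python
def shared_with_reference(identified, reference):
    """Return the peptides present in both the identified set and the reference set."""
    return sorted(set(identified) & set(reference))


# Placeholder tryptic-style reference peptides (ending in K or R), not a real ciRT list.
reference_rt = ["PEPTIDEK", "SAMPLER", "ELVISLIVESK"]
# Placeholder identifications from a non-specific search; only one matches the reference.
identified = ["ALDKFGNSL", "ELVISLIVESK", "SLFGGSVTL", "YLDPAQHGV"]

MIN_ANCHORS = 3  # assumed minimum number of shared peptides, for illustration only

shared = shared_with_reference(identified, reference_rt)
print(f"{len(shared)} shared peptide(s): {shared}")
if len(shared) < MIN_ANCHORS:
    print("Too few shared peptides to align against this reference.")
```

Selecting one of the runs itself as the reference sidesteps this, since the other runs are then aligned against peptides found by the same kind of search.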

anesvi commented 3 years ago

Yes, use automated selection.

If it is fractionated data, you may need to add one unfractionated run.

If you have fractionated DDA and plan to apply it to unfractionated DIA, contact me by email.

shahbazymoh commented 3 years ago

Many thanks, @guoci, @sarah-haynes, @fcyu, and @anesvi, for your helpful comments, and sorry for the late reply. I resolved the issue by applying a wider MS1 tolerance, although I am analyzing a dataset acquired on a Thermo Fusion Orbitrap and I think it should be 10 ppm. I have used that setting successfully in other software tools. I also changed the "RT calibration" setting to "Automatic selection of a run as reference".

@anesvi, yes, I am using a "peptide-centric" approach: first generate a spectral library from the DDA runs, then analyze my DIA datasets. I am searching 9 fractionated pools and 3 independent biological replicates of HLA-bound peptides. I am trying to use this spectral library as input for DIA-NN. If you have any recommendations, please let me know. Thanks

anesvi commented 3 years ago

What you described (building a library from fractionated DDA + some unfractionated DIA) should work. I have done it myself with such a workflow. We have a tutorial on the website. Best, Alexey

sarah-haynes commented 3 years ago

tutorial: https://fragpipe.nesvilab.org/docs/tutorial_DIA.html

shahbazymoh commented 3 years ago

Thanks all. That worked properly for me.