Closed tobiasko closed 10 months ago
I published a hotfix release for spectrum-io (v0.3.3) because it was only there to check whether we have default values for the mass tolerance and unit. As long as you supply these yourself, it should be fine. If you install the newest release of oktoberfest (v0.5.0), this error should be gone. The release will be published tonight and the issue will be closed accordingly. Please reopen the issue should you still encounter the problem.
Hi @picciama,
I updated to v0.5.0 and now get a different error:
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-04 11:07:54,458 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-04 11:07:54,460 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-04 11:07:54,471 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-04 11:07:54,472 - INFO - oktoberfest.runner::run_ce_calibration Found 45 files in the spectra directory.
2023-10-04 11:07:54,473 - INFO - oktoberfest.runner::_preprocess Converting search results from /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pepXML to internal search result.
2023-10-04 11:07:54,473 - INFO - spectrum_io.search_result.search_results::generate_internal Found search results in internal format at /scratch/tobiasko/msms/msms.prosit, skipping conversion
2023-10-04 11:07:54,650 - INFO - oktoberfest.runner::_preprocess Read 98622 PSMs from /scratch/tobiasko/msms/msms.prosit
2023-10-04 11:07:54,771 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/tobiasko/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.rescore
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.9/runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ __loader__ = loader, │
│ 85 │ │ │ │ │ __package__ = pkg_name, │
│ 86 │ │ │ │ │ __spec__ = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:39 in │
│ <module> │
│ │
│ 36 │
│ 37 if __name__ == "__main__": │
│ 38 │ traceback.install() │
│ ❱ 39 │ main() # pragma: no cover │
│ 40 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:34 in main │
│ │
│ 31 │ logger.info(f"Oktoberfest version {__version__}\n{__copyright__}") │
│ 32 │ │
│ 33 │ args = _parse_args() │
│ ❱ 34 │ runner.run_job(args.config_path) │
│ 35 │
│ 36 │
│ 37 if __name__ == "__main__": │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:366 in run_job │
│ │
│ 363 │ if job_type == "SpectralLibraryGeneration": │
│ 364 │ │ generate_spectral_lib(config_path) │
│ 365 │ elif job_type == "CollisionEnergyCalibration": │
│ ❱ 366 │ │ run_ce_calibration(config_path) │
│ 367 │ elif job_type == "Rescoring": │
│ 368 │ │ run_rescoring(config_path) │
│ 369 │ else: │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:229 in │
│ run_ce_calibration │
│ │
│ 226 │ proc_dir = config.output / "proc" │
│ 227 │ proc_dir.mkdir(parents=True, exist_ok=True) │
│ 228 │ │
│ ❱ 229 │ _preprocess(spectra_files, config) │
│ 230 │ │
│ 231 │ processing_pool = JobPool(processes=config.num_threads) │
│ 232 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:45 in │
│ _preprocess │
│ │
│ 42 │ │ search_results = pp.filter_peptides_for_model(peptides=search_results, model=con │
│ 43 │ │ │
│ 44 │ │ # split search results │
│ ❱ 45 │ │ pp.split_search( │
│ 46 │ │ │ search_results=search_results, │
│ 47 │ │ │ output_dir=config.output / "msms", │
│ 48 │ │ │ filenames=[spectra_file.stem for spectra_file in spectra_files], │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:314 in split_search │
│ │
│ 311 │ for filename in filenames: │
│ 312 │ │ output_file = (output_dir / filename).with_suffix(".rescore") │
│ 313 │ │ logger.info(f"Creating split msms.txt file {output_file}") │
│ ❱ 314 │ │ grouped_search_results.get_group(filename).to_csv(output_file) │
│ 315 │
│ 316 │
│ 317 def merge_spectra_and_peptides(spectra: pd.DataFrame, search: pd.DataFrame) -> Spectra: │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pandas/core/groupby/groupby.py:817 in │
│ get_group │
│ │
│ 814 │ │ │
│ 815 │ │ inds = self._get_index(name) │
│ 816 │ │ if not len(inds): │
│ ❱ 817 │ │ │ raise KeyError(name) │
│ 818 │ │ │
│ 819 │ │ return obj._take_with_is_copy(inds, axis=self.axis) │
│ 820 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02'
The msms.prosit file looks like this:
head /scratch/tobiasko/msms/msms.prosit
RAW_FILE,SCAN_NUMBER,MODIFIED_SEQUENCE,PRECURSOR_CHARGE,SCAN_EVENT_NUMBER,MASS,SCORE,REVERSE,SEQUENCE,PEPTIDE_LENGTH
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,5923,HGSNIEAM[UNIMOD:35]SK,2,10,1088.4927,11.824,False,HGSNIEAMSK,10
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6009,HVGDM[UNIMOD:35]GNVK,2,13,971.4494,10.718,False,HVGDMGNVK,9
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6014,VSGTLDTPEK,3,14,1048.5232,10.549,False,VSGTLDTPEK,10
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6074,HGSNIEAM[UNIMOD:35]SK,2,15,1088.4928,22.052,False,HGSNIEAMSK,10
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6162,HVDMVLEK,2,22,970.5026,10.51,True,HVDMVLEK,8
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6165,HVGDM[UNIMOD:35]GNVK,2,23,971.4493,32.305,False,HVGDMGNVK,9
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6300,GAHLPHK,2,32,760.4279,10.492,True,GAHLPHK,7
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6309,AAHDNM[UNIMOD:35]DIDK,3,34,1144.4819,13.032,False,AAHDNMDIDK,10
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6319,VIAHTQM[UNIMOD:35]R,2,36,970.5012,14.518,False,VIAHTQMR,8
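A quick way to see which raw files the internal file actually covers is to count the rows per RAW_FILE value. The following is only a minimal sketch using the Python standard library; the helper name `psms_per_raw_file` is made up for illustration and is not part of oktoberfest or spectrum-io. The first three rows of the listing above are inlined so the snippet runs on its own.

```python
import csv
import io
from collections import Counter

# First rows of the msms.prosit listing above, inlined so the sketch is self-contained.
SAMPLE = """RAW_FILE,SCAN_NUMBER,MODIFIED_SEQUENCE,PRECURSOR_CHARGE,SCAN_EVENT_NUMBER,MASS,SCORE,REVERSE,SEQUENCE,PEPTIDE_LENGTH
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,5923,HGSNIEAM[UNIMOD:35]SK,2,10,1088.4927,11.824,False,HGSNIEAMSK,10
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6009,HVGDM[UNIMOD:35]GNVK,2,13,971.4494,10.718,False,HVGDMGNVK,9
LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01,6014,VSGTLDTPEK,3,14,1048.5232,10.549,False,VSGTLDTPEK,10
"""


def psms_per_raw_file(handle):
    """Count rows (PSMs) per RAW_FILE value in an internal-format CSV (hypothetical helper)."""
    reader = csv.DictReader(handle)
    return Counter(row["RAW_FILE"] for row in reader)


counts = psms_per_raw_file(io.StringIO(SAMPLE))
for raw_file, n_psms in counts.items():
    print(f"{raw_file}\t{n_psms}")
```

For the real file, the same function can be called on an open file handle, e.g. `with open("/scratch/tobiasko/msms/msms.prosit", newline="") as handle: ...`. Any spectra file whose stem does not show up in the result has no PSMs in the search results.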
There is no other file in the output folder:
ls -la /scratch/tobiasko/msms/
total 11096
drwxrwxr-x+ 1 tobiasko SG_Employees 22 Sep 22 13:05 .
drwxrwxr-x+ 1 tobiasko SG_Employees 88 Oct 4 11:07 ..
-rw-rw-r--+ 1 tobiasko SG_Employees 11360807 Sep 22 13:05 msms.prosit
Another thing: I can't find the run_oktoberfest.py script anymore in the latest version:
ls -la
total 40
drwxr-xr-x 9 tobiasko SG_Employees 4096 Oct 4 10:57 .
drwxr-xr-x 123 tobiasko SG_Employees 8192 Oct 4 10:57 ..
drwxr-xr-x 3 tobiasko SG_Employees 75 Oct 4 10:57 data
-rw-r--r-- 1 tobiasko SG_Employees 1411 Oct 4 10:57 __init__.py
-rw-r--r-- 1 tobiasko SG_Employees 1009 Oct 4 10:57 __main__.py
drwxr-xr-x 3 tobiasko SG_Employees 76 Oct 4 10:57 plotting
drwxr-xr-x 3 tobiasko SG_Employees 59 Oct 4 10:57 predict
drwxr-xr-x 3 tobiasko SG_Employees 81 Oct 4 10:57 preprocessing
drwxr-xr-x 2 tobiasko SG_Employees 110 Oct 4 10:57 __pycache__
drwxr-xr-x 3 tobiasko SG_Employees 75 Oct 4 10:57 rescore
-rw-r--r-- 1 tobiasko SG_Employees 14279 Oct 4 10:57 runner.py
drwxr-xr-x 3 tobiasko SG_Employees 134 Oct 4 10:57 utils
If this is intended and will stay like this in the future, you might update this here and replace it with this
Concerning the KeyError you get: This is likely because you provided a folder that contains a raw file whose name is not present in the msms.prosit file. Please check the following potential issues:
Meanwhile, I will implement a check that prints a warning if no PSMs for a provided filename could be found in the search results.
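Until such a check exists, something along these lines can be run by hand. This is a rough sketch under stated assumptions, not the actual oktoberfest implementation: the function name `find_missing_filenames` is hypothetical, and it assumes that the stem of each spectra file is what gets matched against the RAW_FILE column, as the traceback above suggests.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def find_missing_filenames(spectra_files, raw_file_values):
    """Return the stems of spectra files that have no PSMs in the search results.

    Hypothetical helper: spectra_files is an iterable of paths, raw_file_values
    is the content of the RAW_FILE column of the internal search results.
    """
    covered = set(raw_file_values)
    # A .raw and a .mzML file with the same stem are only reported once.
    missing = sorted({Path(f).stem for f in spectra_files} - covered)
    for name in missing:
        logger.warning("No PSMs found for %s in the search results", name)
    return missing


# Example mirroring the situation in this issue: two spectra files,
# but the search results only contain PSMs for the first one.
spectra_files = [
    Path("LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML"),
    Path("LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.mzML"),
]
raw_file_values = ["LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01"] * 3

missing = find_missing_filenames(spectra_files, raw_file_values)
print(missing)
```

Warning and skipping the missing names, instead of calling `get_group` on them, would avoid the KeyError, at the cost of silently processing fewer files than the spectra folder contains.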
Concerning the second point: A lot of code was cleaned up for the 0.5.0 release, so run_oktoberfest.py was integrated into runner.py. Well spotted, I will correct the documentation on GitHub to reflect what is written on oktoberfest.readthedocs.io.
Hmmm, the spectra folder contains many more raw files than are covered by the .pepxml file:
ls -la /scratch/cpanse/PXD028735/dda/
total 218975360
drwxrwxr-x+ 1 tobiasko SG_Employees 7154 Jul 18 09:52 .
drwxrwxrwx+ 1 cpanse SG_Employees 6756 Sep 22 11:28 ..
-rw-rw-r--+ 1 cpanse SG_Employees 3337 Jul 17 10:29 checmsum.md5
-rw-rw-r--+ 1 tobiasko SG_Employees 3427 Jul 13 15:29 dda.fp-manifest
-rw-rw-r--+ 1 tobiasko SG_Employees 11261 Jul 13 15:31 Default_zero_Oktoberfest.workflow
drwxrwxr-x+ 1 tobiasko SG_Employees 26 Jul 14 09:13 FragPipeOutput
-rw-rw-r--+ 1 root root 1276022114 Jul 14 14:13 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3459277876 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.raw
-rw-rw-r--+ 1 root root 1314764144 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3612100302 May 11 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.raw
-rw-rw-r--+ 1 root root 1361457428 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3701529661 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.raw
-rw-rw-r--+ 1 root root 1370533416 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3803946051 May 11 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.raw
-rw-rw-r--+ 1 root root 1350804551 Jul 14 14:16 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3726871843 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.raw
-rw-rw-r--+ 1 root root 1360132006 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3744014296 May 12 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.raw
-rw-rw-r--+ 1 root root 1324188169 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3557854702 May 12 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.raw
-rw-rw-r--+ 1 root root 1381335766 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3847093811 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.raw
-rw-rw-r--+ 1 root root 1335672816 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3663013199 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.raw
-rw-rw-r--+ 1 root root 1371316733 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3737657800 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.raw
-rw-rw-r--+ 1 root root 1396203946 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3925017162 May 10 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.raw
-rw-rw-r--+ 1 root root 1365540346 Jul 14 14:15 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3774274368 May 12 2022 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.raw
-rw-rw-r--+ 1 root root 1290801791 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3501153502 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.raw
-rw-rw-r--+ 1 root root 1380771948 Jul 14 14:16 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3904127239 May 10 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.raw
-rw-rw-r--+ 1 root root 1360576319 Jul 14 14:14 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3693116392 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.raw
-rw-rw-r--+ 1 root root 1374002406 Jul 14 14:13 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3824873805 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.raw
-rw-rw-r--+ 1 root root 1349182218 Jul 14 14:31 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3712861001 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.raw
-rw-rw-r--+ 1 root root 1363783417 Jul 14 14:33 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3723972878 May 10 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.raw
-rw-rw-r--+ 1 root root 1365602005 Jul 14 14:33 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3807936631 May 10 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.raw
-rw-rw-r--+ 1 root root 1369697566 Jul 14 14:32 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3808610486 May 10 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.raw
-rw-rw-r--+ 1 root root 1336687477 Jul 14 14:32 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3679450988 May 11 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.raw
-rw-rw-r--+ 1 root root 1375750763 Jul 14 14:33 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3751359283 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.raw
-rw-rw-r--+ 1 root root 1387911522 Jul 14 14:32 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3888135685 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.raw
-rw-rw-r--+ 1 root root 1354050052 Jul 14 14:32 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3737447850 May 12 2022 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.raw
-rw-rw-r--+ 1 root root 1000801557 Jul 14 14:27 LFQ_Orbitrap_DDA_Ecoli_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3228500047 May 11 2022 LFQ_Orbitrap_DDA_Ecoli_01.raw
-rw-rw-r--+ 1 root root 993546580 Jul 14 14:27 LFQ_Orbitrap_DDA_Ecoli_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3222535657 May 12 2022 LFQ_Orbitrap_DDA_Ecoli_02.raw
-rw-rw-r--+ 1 root root 989534228 Jul 14 14:27 LFQ_Orbitrap_DDA_Ecoli_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3210727843 May 12 2022 LFQ_Orbitrap_DDA_Ecoli_03.raw
-rw-rw-r--+ 1 root root 1363285349 Jul 14 14:33 LFQ_Orbitrap_DDA_Human_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3704372427 May 10 2022 LFQ_Orbitrap_DDA_Human_01.raw
-rw-rw-r--+ 1 root root 1334702959 Jul 14 14:33 LFQ_Orbitrap_DDA_Human_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3616768504 May 10 2022 LFQ_Orbitrap_DDA_Human_02.raw
-rw-rw-r--+ 1 root root 1335722200 Jul 14 14:33 LFQ_Orbitrap_DDA_Human_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3658873260 May 10 2022 LFQ_Orbitrap_DDA_Human_03.raw
-rw-rw-r--+ 1 root root 1277506970 Jul 14 14:33 LFQ_Orbitrap_DDA_QC_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3465044037 May 12 2022 LFQ_Orbitrap_DDA_QC_01.raw
-rw-rw-r--+ 1 root root 1326638205 Jul 14 14:34 LFQ_Orbitrap_DDA_QC_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3646730576 May 12 2022 LFQ_Orbitrap_DDA_QC_02.raw
-rw-rw-r--+ 1 root root 1350655649 Jul 14 14:45 LFQ_Orbitrap_DDA_QC_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3719296311 May 12 2022 LFQ_Orbitrap_DDA_QC_03.raw
-rw-rw-r--+ 1 root root 1325318217 Jul 14 14:45 LFQ_Orbitrap_DDA_QC_04.mzML
-r--r--r--+ 1 cpanse SG_Employees 3638629986 May 12 2022 LFQ_Orbitrap_DDA_QC_04.raw
-rw-rw-r--+ 1 root root 1373957153 Jul 14 14:47 LFQ_Orbitrap_DDA_QC_05.mzML
-r--r--r--+ 1 cpanse SG_Employees 3848899529 May 12 2022 LFQ_Orbitrap_DDA_QC_05.raw
-rw-rw-r--+ 1 root root 1353831466 Jul 14 14:49 LFQ_Orbitrap_DDA_QC_06.mzML
-r--r--r--+ 1 cpanse SG_Employees 3679452851 May 12 2022 LFQ_Orbitrap_DDA_QC_06.raw
-rw-rw-r--+ 1 root root 1364329047 Jul 14 14:49 LFQ_Orbitrap_DDA_QC_07.mzML
-r--r--r--+ 1 cpanse SG_Employees 3710507404 May 12 2022 LFQ_Orbitrap_DDA_QC_07.raw
-rw-rw-r--+ 1 root root 1359819605 Jul 14 14:50 LFQ_Orbitrap_DDA_QC_08.mzML
-r--r--r--+ 1 cpanse SG_Employees 3694714723 May 10 2022 LFQ_Orbitrap_DDA_QC_08.raw
-rw-rw-r--+ 1 root root 1392466478 Jul 14 14:50 LFQ_Orbitrap_DDA_QC_09.mzML
-r--r--r--+ 1 cpanse SG_Employees 3955392294 May 10 2022 LFQ_Orbitrap_DDA_QC_09.raw
-rw-rw-r--+ 1 root root 1371714129 Jul 14 14:50 LFQ_Orbitrap_DDA_QC_10.mzML
-r--r--r--+ 1 cpanse SG_Employees 3826713756 May 12 2022 LFQ_Orbitrap_DDA_QC_10.raw
-rw-rw-r--+ 1 root root 1373229425 Jul 14 14:51 LFQ_Orbitrap_DDA_QC_11.mzML
-r--r--r--+ 1 cpanse SG_Employees 3828401838 May 10 2022 LFQ_Orbitrap_DDA_QC_11.raw
-rw-rw-r--+ 1 root root 1356128901 Jul 14 14:50 LFQ_Orbitrap_DDA_QC_12.mzML
-r--r--r--+ 1 cpanse SG_Employees 3767945668 May 10 2022 LFQ_Orbitrap_DDA_QC_12.raw
-rw-rw-r--+ 1 root root 1217563461 Jul 14 14:46 LFQ_Orbitrap_DDA_Yeast_01.mzML
-r--r--r--+ 1 cpanse SG_Employees 3275043903 May 10 2022 LFQ_Orbitrap_DDA_Yeast_01.raw
-rw-rw-r--+ 1 root root 1200474584 Jul 14 14:46 LFQ_Orbitrap_DDA_Yeast_02.mzML
-r--r--r--+ 1 cpanse SG_Employees 3259909482 May 12 2022 LFQ_Orbitrap_DDA_Yeast_02.raw
-rw-rw-r--+ 1 root root 1204583659 Jul 14 14:46 LFQ_Orbitrap_DDA_Yeast_03.mzML
-r--r--r--+ 1 cpanse SG_Employees 3303790909 May 10 2022 LFQ_Orbitrap_DDA_Yeast_03.raw
-rw-r--r--+ 1 cpanse SG_Employees 1018 Jul 14 13:56 Makefile
-rwxrw-r--+ 1 tobiasko SG_Employees 228 Jul 14 09:13 runfragpipe.bash
There is actually one pepXML file for each raw file in the FragPipe output folder:
ls -la /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/
total 17695832
drwxrwxr-x+ 1 tobiasko SG_Employees 19792 Jul 14 17:13 .
drwxrwxr-x+ 1 tobiasko SG_Employees 26 Jul 14 09:13 ..
-rw-rw-r--+ 1 tobiasko SG_Employees 242722540 Jul 14 17:02 combined.prot.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 5632 Jul 14 14:58 filelist_proteinprophet.txt
-rw-rw-r--+ 1 tobiasko SG_Employees 9651 Jul 14 17:12 filter.log
-rw-rw-r--+ 1 tobiasko SG_Employees 9788 Jul 14 14:58 fragger.params
-rw-rw-r--+ 1 tobiasko SG_Employees 3426 Jul 14 17:13 fragpipe-files.fp-manifest
-rw-rw-r--+ 1 tobiasko SG_Employees 11715 Jul 14 17:13 fragpipe.workflow
-rw-rw-r--+ 1 tobiasko SG_Employees 120419761 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 123579205 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130790185 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129977302 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126921319 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126560590 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126196830 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129856529 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 125736895 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130117333 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129414446 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128472556 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 124265328 Jul 14 16:56 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 123593038 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 132080825 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 131977057 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129268296 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130518466 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126464790 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128592545 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126860869 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 131527795 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130646835 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128423451 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 30271048 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Ecoli_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 27320573 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Ecoli_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 28257458 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Ecoli_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 136202606 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Human_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 131946385 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Human_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130463052 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Human_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 121162792 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_QC_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 124888914 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_QC_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128168350 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_QC_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 125542002 Jul 14 16:56 interact-LFQ_Orbitrap_DDA_QC_04.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126178860 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_QC_05.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128525437 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_QC_06.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129784624 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_QC_07.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 130186307 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_QC_08.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 126886411 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_QC_09.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 129204495 Jul 14 16:56 interact-LFQ_Orbitrap_DDA_QC_10.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 128970846 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_QC_11.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 127577728 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_QC_12.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 74739122 Jul 14 16:57 interact-LFQ_Orbitrap_DDA_Yeast_01.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 72800646 Jul 14 16:59 interact-LFQ_Orbitrap_DDA_Yeast_02.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 72540553 Jul 14 16:58 interact-LFQ_Orbitrap_DDA_Yeast_03.pep.xml
-rw-rw-r--+ 1 tobiasko SG_Employees 25769444 Jul 14 17:12 ion.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34243596 Jul 14 16:25 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 139837745 Jul 14 15:47 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30812016 Jul 14 15:47 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 44430693 Jul 14 15:47 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35114184 Jul 14 16:25 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 143103363 Jul 14 15:48 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31574305 Jul 14 15:48 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 45385140 Jul 14 15:48 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37562392 Jul 14 16:25 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 153244328 Jul 14 15:48 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33796150 Jul 14 15:49 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48592155 Jul 14 15:48 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36997852 Jul 14 16:26 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 151527906 Jul 14 15:49 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33301849 Jul 14 15:49 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48226090 Jul 14 15:49 LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36601149 Jul 14 16:26 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 149645919 Jul 14 15:50 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32909394 Jul 14 15:50 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47540095 Jul 14 15:50 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36769787 Jul 14 16:26 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 150111972 Jul 14 15:51 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33073217 Jul 14 15:51 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47499973 Jul 14 15:51 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35967345 Jul 14 16:26 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 147370928 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32363466 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46780157 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37228422 Jul 14 16:26 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 152838033 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33496584 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48578867 Jul 14 15:52 LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36028471 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 147046140 Jul 14 15:53 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32413233 Jul 14 15:53 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46616452 Jul 14 15:53 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37791748 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 154175872 Jul 14 15:54 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 34009128 Jul 14 15:54 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48840393 Jul 14 15:54 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37432545 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 153361536 Jul 14 15:55 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33696784 Jul 14 15:55 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48742741 Jul 14 15:55 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36920841 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 151485897 Jul 14 15:55 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33242371 Jul 14 15:56 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48197550 Jul 14 15:56 LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35318445 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 144915903 Jul 14 15:56 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31818856 Jul 14 15:56 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46157493 Jul 14 15:56 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36178440 Jul 14 16:27 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 148415804 Jul 14 15:57 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32606188 Jul 14 15:57 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47159197 Jul 14 15:57 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37758105 Jul 14 16:28 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 154835341 Jul 14 15:58 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 34008320 Jul 14 15:58 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 49247326 Jul 14 15:58 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37267197 Jul 14 16:28 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 153586389 Jul 14 15:59 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33580391 Jul 14 15:59 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 49070473 Jul 14 15:59 LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36768331 Jul 14 16:28 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 151038624 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33094161 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48079943 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37426462 Jul 14 16:28 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 153710400 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33700107 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48835061 Jul 14 16:00 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36535152 Jul 14 16:28 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 150820970 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32924430 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48073001 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36919102 Jul 14 16:29 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 152269809 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33255364 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48497050 Jul 14 16:01 LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36023504 Jul 14 16:29 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 147615582 Jul 14 16:02 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32429648 Jul 14 16:02 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46958191 Jul 14 16:02 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 38076620 Jul 14 16:29 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 155899367 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 34295494 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 49496597 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 37501902 Jul 14 16:29 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 154427209 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33798848 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 49287792 Jul 14 16:03 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36643148 Jul 14 16:29 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 150921871 Jul 14 16:04 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 33023871 Jul 14 16:04 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48133957 Jul 14 16:04 LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 12572235 Jul 14 16:29 LFQ_Orbitrap_DDA_Ecoli_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 49425811 Jul 14 16:04 LFQ_Orbitrap_DDA_Ecoli_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 11155807 Jul 14 16:04 LFQ_Orbitrap_DDA_Ecoli_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 14094687 Jul 14 16:04 LFQ_Orbitrap_DDA_Ecoli_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 11086553 Jul 14 16:29 LFQ_Orbitrap_DDA_Ecoli_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 43600238 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 9838225 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 12434222 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 11963646 Jul 14 16:29 LFQ_Orbitrap_DDA_Ecoli_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47058253 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 10622109 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 13431699 Jul 14 16:05 LFQ_Orbitrap_DDA_Ecoli_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 36370917 Jul 14 16:30 LFQ_Orbitrap_DDA_Human_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 159003949 Jul 14 16:06 LFQ_Orbitrap_DDA_Human_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 32664218 Jul 14 16:06 LFQ_Orbitrap_DDA_Human_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 52167446 Jul 14 16:06 LFQ_Orbitrap_DDA_Human_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35152179 Jul 14 16:30 LFQ_Orbitrap_DDA_Human_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 153616033 Jul 14 16:06 LFQ_Orbitrap_DDA_Human_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31576632 Jul 14 16:07 LFQ_Orbitrap_DDA_Human_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 50379733 Jul 14 16:07 LFQ_Orbitrap_DDA_Human_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34986669 Jul 14 16:30 LFQ_Orbitrap_DDA_Human_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 152896234 Jul 14 16:07 LFQ_Orbitrap_DDA_Human_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31423251 Jul 14 16:07 LFQ_Orbitrap_DDA_Human_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 50143378 Jul 14 16:07 LFQ_Orbitrap_DDA_Human_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 32331099 Jul 14 16:30 LFQ_Orbitrap_DDA_QC_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 139331524 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 28877457 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 45067276 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 33649871 Jul 14 16:30 LFQ_Orbitrap_DDA_QC_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 144634974 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30048458 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46697222 Jul 14 16:08 LFQ_Orbitrap_DDA_QC_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34378719 Jul 14 16:30 LFQ_Orbitrap_DDA_QC_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 147829013 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30698284 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47749170 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 33382876 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_04_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 143630293 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_04.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 29802908 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_04.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 46387486 Jul 14 16:09 LFQ_Orbitrap_DDA_QC_04.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34161295 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_05_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 146825523 Jul 14 16:10 LFQ_Orbitrap_DDA_QC_05.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30511701 Jul 14 16:10 LFQ_Orbitrap_DDA_QC_05.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47342807 Jul 14 16:10 LFQ_Orbitrap_DDA_QC_05.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34878374 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_06_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 150172472 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_06.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31163932 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_06.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48512357 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_06.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35152489 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_07_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 151155667 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_07.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31398395 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_07.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48750033 Jul 14 16:11 LFQ_Orbitrap_DDA_QC_07.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 35057172 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_08_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 150841948 Jul 14 16:12 LFQ_Orbitrap_DDA_QC_08.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 31313604 Jul 14 16:12 LFQ_Orbitrap_DDA_QC_08.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48673077 Jul 14 16:12 LFQ_Orbitrap_DDA_QC_08.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34307597 Jul 14 16:31 LFQ_Orbitrap_DDA_QC_09_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 148249915 Jul 14 16:13 LFQ_Orbitrap_DDA_QC_09.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30667400 Jul 14 16:13 LFQ_Orbitrap_DDA_QC_09.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47988959 Jul 14 16:13 LFQ_Orbitrap_DDA_QC_09.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34496062 Jul 14 16:32 LFQ_Orbitrap_DDA_QC_10_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 149108875 Jul 14 16:14 LFQ_Orbitrap_DDA_QC_10.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30825335 Jul 14 16:14 LFQ_Orbitrap_DDA_QC_10.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48303712 Jul 14 16:14 LFQ_Orbitrap_DDA_QC_10.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34561085 Jul 14 16:32 LFQ_Orbitrap_DDA_QC_11_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 149365768 Jul 14 16:15 LFQ_Orbitrap_DDA_QC_11.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30882349 Jul 14 16:15 LFQ_Orbitrap_DDA_QC_11.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 48380501 Jul 14 16:15 LFQ_Orbitrap_DDA_QC_11.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 34068907 Jul 14 16:32 LFQ_Orbitrap_DDA_QC_12_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 147331123 Jul 14 16:16 LFQ_Orbitrap_DDA_QC_12.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 30445880 Jul 14 16:16 LFQ_Orbitrap_DDA_QC_12.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 47744298 Jul 14 16:16 LFQ_Orbitrap_DDA_QC_12.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 26763710 Jul 14 16:32 LFQ_Orbitrap_DDA_Yeast_01_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 106844595 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_01.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 23699257 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_01.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 31861797 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_01.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 26045696 Jul 14 16:32 LFQ_Orbitrap_DDA_Yeast_02_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 104064965 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_02.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 23064377 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_02.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 31080830 Jul 14 16:17 LFQ_Orbitrap_DDA_Yeast_02.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 25783290 Jul 14 16:32 LFQ_Orbitrap_DDA_Yeast_03_edited.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 102914665 Jul 14 16:18 LFQ_Orbitrap_DDA_Yeast_03.pepXML
-rw-rw-r--+ 1 tobiasko SG_Employees 22825700 Jul 14 16:18 LFQ_Orbitrap_DDA_Yeast_03.pin
-rw-rw-r--+ 1 tobiasko SG_Employees 30705440 Jul 14 16:18 LFQ_Orbitrap_DDA_Yeast_03.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 156767 Jul 14 09:14 log_2023-07-14_09-14-13.txt
-rw-rw-r--+ 1 tobiasko SG_Employees 2180 Jul 14 14:58 msbooster_params.txt
drwxrwxr-x+ 1 tobiasko SG_Employees 3434 Jul 14 16:32 MSBooster_RTplots
-rw-rw-r--+ 1 tobiasko SG_Employees 16866052 Jul 14 17:12 peptide.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 6792707 Jul 14 17:12 protein.fas
-rw-rw-r--+ 1 tobiasko SG_Employees 2618965 Jul 14 17:12 protein.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 1217193542 Jul 14 17:13 psm.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 15056686 Jul 14 16:20 spectraRT_full.tsv
-rw-rw-r--+ 1 tobiasko SG_Employees 98019792 Jul 14 16:25 spectraRT.predicted.bin
-rw-rw-r--+ 1 tobiasko SG_Employees 15533157 Jul 14 16:20 spectraRT.tsv
but I provided only one in the config file. Why exactly is this a problem? Shouldn't the search result determine which raw files need to be found?
not found, as expected due to .pepxml file coverage:
grep LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02 /scratch/tobiasko/msms/msms.prosit
I changed the source to a single file:
head /home/tobiasko/CEcalibration_config.json
{
"type": "CollisionEnergyCalibration",
"tag": "",
"allFeatures": false,
"inputs": {
"search_results_type": "Msfragger",
"spectra": "/scratch/cpanse/PXD028735/dda/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML",
"spectra_type": "mzml",
"search_results": "/scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pepXML"
},
and get a different error:
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-04 15:13:48,752 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-04 15:13:48,753 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-04 15:13:48,754 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-04 15:13:48,755 - INFO - oktoberfest.runner::run_ce_calibration Found 1 files in the spectra directory.
2023-10-04 15:13:48,755 - INFO - oktoberfest.runner::_preprocess Converting search results from /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.pepXML to internal search result.
2023-10-04 15:13:48,755 - INFO - spectrum_io.search_result.search_results::generate_internal Found search results in internal format at /scratch/tobiasko/msms/msms.prosit, skipping conversion
2023-10-04 15:13:48,876 - INFO - oktoberfest.runner::_preprocess Read 98622 PSMs from /scratch/tobiasko/msms/msms.prosit
2023-10-04 15:13:48,984 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/tobiasko/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.rescore
Waiting for tasks to complete: 0%| | 0/1 [00:00<?, ?it/s]2023-10-04 15:13:49,442 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/dda/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML
Waiting for tasks to complete: 0%| | 0/1 [00:02<?, ?it/s]
2023-10-04 15:13:51,932 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-04 15:13:51,932 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-04 15:13:51,933 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 154, in read_mzml
instrument_configuration_ref = spec["scanList"]["scan"][0]["instrumentConfigurationRef"]
KeyError: 'instrumentConfigurationRef'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'instrumentConfigurationRef'
2023-10-04 15:13:51,933 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 154, in read_mzml
instrument_configuration_ref = spec["scanList"]["scan"][0]["instrumentConfigurationRef"]
KeyError: 'instrumentConfigurationRef'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'instrumentConfigurationRef'
2023-10-04 15:13:51,934 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
2023-10-04 15:13:51,934 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
but I provided only one in the config file. Why exactly is this a problem? Shouldn't the search result determine which raw files need to be found?
Yes indeed, it shouldn't raise a key error. This is an inconvenience that I will address by printing a warning instead of raising an error. If you want to include all files in the msms.prosit, you can also provide the folder instead of a single pepXML file; Oktoberfest will then include all pepXML files contained in that folder and all of its subfolders.
For now, you have solved this by explicitly providing one raw file.
and get a different error:
The mzML file should contain a header that defines a list of instrument types for each MS level. Each spectrum then uses a reference, "instrumentConfigurationRef", that defines which instrument was used to measure it. It seems your mzML file does not have this. It may be that this is not standard. We convert all our raw files with ThermoRawFileParser, and then it works, but you said you are using MSConvert. It seems the output isn't consistent between the tools, which is of course a shame :( If possible, please check the mzML file and help me find out what differs between the tools; then I would be able to add support for the MSConvert mzML format in the future.
I will add a check for whether the instrument reference is provided and, if it is not, rely on the user providing mass tolerance and unit.
Can you maybe send me an email with one of the pepXML files and the corresponding mzML file? That would be helpful for debugging this particular case.
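A check like the one described could look roughly like the sketch below. This is not the actual spectrum-io code: `get_instrument_ref` is a hypothetical helper, and the only part taken from the traceback is the `spec["scanList"]["scan"][0]["instrumentConfigurationRef"]` lookup, with `spec` assumed to behave like a plain nested dict.

```python
from typing import Any, Mapping, Optional


def get_instrument_ref(spec: Mapping[str, Any]) -> Optional[str]:
    """Return the instrumentConfigurationRef of the first scan, or None if it is absent.

    Hypothetical helper: it replaces the direct key lookup that raises KeyError
    with a tolerant lookup, so the caller can fall back to user-supplied
    mass tolerance and unit.
    """
    scans = spec.get("scanList", {}).get("scan", [])
    if not scans:
        return None
    return scans[0].get("instrumentConfigurationRef")


# Spectrum whose first scan carries the reference.
with_ref = {"scanList": {"scan": [{"instrumentConfigurationRef": "IC1"}]}}
# Spectrum without the per-scan reference, mimicking the failing case.
without_ref = {"scanList": {"scan": [{}]}}

print(get_instrument_ref(with_ref))
print(get_instrument_ref(without_ref))
```

A caller would then treat `None` as "no instrument information available" and use the mass tolerance and unit from the config instead of failing.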
I would honestly say that if anything in the mzML space defines a standard or reference implementation, it is msconvert/the ProteoWizard project, not the ThermoRawFileParser. There is now a containerized version that also works on Linux, see here. This is what we are using on our HPC nodes to convert raw files. Would that also be a solution for Oktoberfest?
mzML makes me 🤢🤬💀🤯 !
But do I get this right: In the end all you want is to extract ~1000 MS2 scans from a raw file to do the spectral angle/similarity calculation, and because you can't request the peak lists of these scans selectively using some sort of API, you need to convert the complete file to mzML. Welcome to the future! ;-)
Ok, I think I found a fix, and I used that one mzML and performed CECalibration and Rescoring with it. The rescoring results on peptide level and spectral angle for the tested CEs are below, so it definitely works now. I have pushed the fix to the fix/mzml_instrumentConfigurationRef branch of spectrum-io, you can install it using pip install git+https://github.com/wilhelm-lab/spectrum_io.git#fix/mzml_instrumentConfigurationRef for now, until I release this.
But do I get this right: In the end all you want is to extract ~1000 MS2 scans from a raw file to do the spectral angle/similarity calculation, and because you can't request the peak lists of these scans selectively using some sort of API, you need to convert the complete file to mzML. Welcome to the future! ;-)
Yes, for CE calibration, we currently take the top 1000 scoring target PSMs, so unfortunately, we need to read all of them and then sort by score. If you know a better way of doing this, please let me know :)
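To illustrate the selection step only: the sketch below keeps the n best-scoring target PSMs from a list of records. It is not Oktoberfest's implementation; the function name `top_target_psms` and the field names `scan_number`, `score` and `is_decoy` are assumptions made for the example. Every PSM still has to be read once, but a full sort is not required.

```python
import heapq
from typing import Any, Iterable, List, Mapping


def top_target_psms(psms: Iterable[Mapping[str, Any]], n: int = 1000) -> List[Mapping[str, Any]]:
    """Return the n highest-scoring target (non-decoy) PSMs, best first."""
    targets = (p for p in psms if not p["is_decoy"])
    # heapq.nlargest makes a single pass and keeps only n candidates in memory.
    return heapq.nlargest(n, targets, key=lambda p: p["score"])


psms = [
    {"scan_number": 1, "score": 10.0, "is_decoy": False},
    {"scan_number": 2, "score": 50.0, "is_decoy": True},
    {"scan_number": 3, "score": 30.0, "is_decoy": False},
    {"scan_number": 4, "score": 40.0, "is_decoy": False},
    {"scan_number": 5, "score": 20.0, "is_decoy": False},
]

best = top_target_psms(psms, n=2)
print([p["scan_number"] for p in best])  # [4, 3]
```

The scan numbers of the selected PSMs are exactly what a selective reader would need in order to skip the rest of the raw file.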
My comment was more about the conversion of 99% of the scan data (the raw file) that is not needed afterwards anyway. If one could selectively request those ~1000 scans from the binary file without linear read access... it would save so much time and computation...
Nice plots! OK, I will try to update spectrum-io!
My comment was more about the conversion of 99% of the scan data (the raw file) that is not needed afterwards anyway. If one could selectively request those ~1000 scans from the binary file without linear read access... it would save so much time and computation...
Yes, good idea, especially in situations with many raw files. This apparently does work with ThermoRawFileParser by providing the scan numbers you want to extract, but it would require many changes. I created an issue for that: https://github.com/wilhelm-lab/oktoberfest/issues/135
I updated by
pip install git+https://github.com/wilhelm-lab/spectrum_io.git#fix/mzml_instrumentConfigurationRef
Collecting git+https://github.com/wilhelm-lab/spectrum_io.git#fix/mzml_instrumentConfigurationRef
Cloning https://github.com/wilhelm-lab/spectrum_io.git to /tmp/pip-req-build-s3u27zjj
Running command git clone -q https://github.com/wilhelm-lab/spectrum_io.git /tmp/pip-req-build-s3u27zjj
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Requirement already satisfied: click>=8.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (8.1.7)
Requirement already satisfied: h5py<4.0.0,>=3.1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (3.9.0)
Requirement already satisfied: tables<4.0.0,>=3.6.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (3.8.0)
Requirement already satisfied: pymzml<3.0.0,>=2.5.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (2.5.2)
Requirement already satisfied: PyYAML>=5.4.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (6.0.1)
Requirement already satisfied: pandas<2.0.0,>=1.3.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (1.5.3)
Requirement already satisfied: spectrum-fundamentals<0.5.0,>=0.4.3 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (0.4.3)
Requirement already satisfied: lxml<5.0.0,>=4.5.2 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (4.9.3)
Requirement already satisfied: pyteomics<5.0.0,>=4.3.3 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (4.6.2)
Requirement already satisfied: rich>=10.3.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (13.5.3)
Requirement already satisfied: numpy<2.0.0,>=1.18.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (1.24.4)
Requirement already satisfied: python-dateutil>=2.8.1 in ./oktoberfest-env/lib/python3.9/site-packages (from pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in ./oktoberfest-env/lib/python3.9/site-packages (from pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (2023.3.post1)
Requirement already satisfied: regex in ./oktoberfest-env/lib/python3.9/site-packages (from pymzml<3.0.0,>=2.5.0->spectrum_io==0.3.3) (2023.8.8)
Requirement already satisfied: six>=1.5 in ./oktoberfest-env/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (1.16.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./oktoberfest-env/lib/python3.9/site-packages (from rich>=10.3.0->spectrum_io==0.3.3) (2.16.1)
Requirement already satisfied: markdown-it-py>=2.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from rich>=10.3.0->spectrum_io==0.3.3) (3.0.0)
Requirement already satisfied: mdurl~=0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from markdown-it-py>=2.2.0->rich>=10.3.0->spectrum_io==0.3.3) (0.1.2)
Requirement already satisfied: scikit-learn<2.0,>=1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.3.1)
Requirement already satisfied: moepy<2.0.0,>=1.1.4 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.1.4)
Requirement already satisfied: joblib<2.0.0,>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.3.2)
Requirement already satisfied: matplotlib>=3.3.3 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.8.0)
Requirement already satisfied: scipy>=1.6.0 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.11.2)
Requirement already satisfied: tqdm>=4.59.0 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (4.66.1)
Requirement already satisfied: cycler>=0.10 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (0.11.0)
Requirement already satisfied: pyparsing>=2.3.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.1.1)
Requirement already satisfied: fonttools>=4.22.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (4.42.1)
Requirement already satisfied: importlib-resources>=3.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (6.1.0)
Requirement already satisfied: kiwisolver>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.4.5)
Requirement already satisfied: contourpy>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.1.1)
Requirement already satisfied: pillow>=6.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (10.0.1)
Requirement already satisfied: packaging>=20.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (23.1)
Requirement already satisfied: zipp>=3.1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from importlib-resources>=3.2.0->matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.17.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from scikit-learn<2.0,>=1.0->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.2.0)
Requirement already satisfied: py-cpuinfo in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (9.0.0)
Requirement already satisfied: blosc2~=2.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (2.0.0)
Requirement already satisfied: numexpr>=2.6.2 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (2.8.6)
Requirement already satisfied: cython>=0.29.21 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (3.0.2)
Requirement already satisfied: msgpack in ./oktoberfest-env/lib/python3.9/site-packages (from blosc2~=2.0.0->tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (1.0.6)
but the error does not disappear:
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-05 15:22:45,816 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-05 15:22:45,816 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-05 15:22:45,817 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-05 15:22:45,817 - INFO - oktoberfest.runner::run_ce_calibration Found 1 files in the spectra directory.
2023-10-05 15:22:45,817 - INFO - oktoberfest.utils.process_step::is_done Skipping preprocessing_search step because /scratch/tobiasko/proc/preprocessing_search.done was found.
Waiting for tasks to complete: 0%| | 0/1 [00:00<?, ?it/s]2023-10-05 15:22:45,833 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/dda/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML
Waiting for tasks to complete: 0%| | 0/1 [00:03<?, ?it/s]
2023-10-05 15:22:48,872 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-05 15:22:48,872 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-05 15:22:48,874 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 154, in read_mzml
instrument_configuration_ref = spec["scanList"]["scan"][0]["instrumentConfigurationRef"]
KeyError: 'instrumentConfigurationRef'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'instrumentConfigurationRef'
2023-10-05 15:22:48,874 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 154, in read_mzml
instrument_configuration_ref = spec["scanList"]["scan"][0]["instrumentConfigurationRef"]
KeyError: 'instrumentConfigurationRef'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'instrumentConfigurationRef'
2023-10-05 15:22:48,874 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
2023-10-05 15:22:48,874 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
What am I doing wrong?
Try again but instead of "#" write "@", i.e. pip install git+https://github.com/wilhelm-lab/spectrum_io.git@fix/mzml_instrumentConfigurationRef
Check while installing, that the log output specifically states it is checking out that branch.
Sometimes one also needs to first uninstall the package for whatever reason...
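Besides reading the install log, one way to double-check which revision pip recorded is to look at the `direct_url.json` metadata file that recent pip versions write for installs from a URL (PEP 610). The sketch below is only an illustration: `requested_revision` and `installed_revision` are hypothetical helper names, and it assumes the installed pip is new enough to write that file.

```python
import json
from importlib import metadata
from typing import Optional


def requested_revision(direct_url_json: Optional[str]) -> Optional[str]:
    """Extract the requested VCS revision from the text of a direct_url.json file."""
    if not direct_url_json:
        return None
    info = json.loads(direct_url_json)
    return info.get("vcs_info", {}).get("requested_revision")


def installed_revision(package: str) -> Optional[str]:
    """Return the branch/tag/commit requested at install time, or None if unknown."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    # read_text returns None when the distribution has no direct_url.json.
    return requested_revision(dist.read_text("direct_url.json"))


print(installed_revision("spectrum_io"))
```

If the branch was really checked out, this should print the branch name; `None` suggests that the package is missing, was installed from an index, or that no revision was requested in the URL.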
no difference.
pip install git+https://github.com/wilhelm-lab/spectrum_io.git@fix/mzml_instrumentConfigurationRef
Collecting git+https://github.com/wilhelm-lab/spectrum_io.git@fix/mzml_instrumentConfigurationRef
Cloning https://github.com/wilhelm-lab/spectrum_io.git (to revision fix/mzml_instrumentConfigurationRef) to /tmp/pip-req-build-8q68fh98
Running command git clone -q https://github.com/wilhelm-lab/spectrum_io.git /tmp/pip-req-build-8q68fh98
Running command git checkout -b fix/mzml_instrumentConfigurationRef --track origin/fix/mzml_instrumentConfigurationRef
Switched to a new branch 'fix/mzml_instrumentConfigurationRef'
Branch 'fix/mzml_instrumentConfigurationRef' set up to track remote branch 'fix/mzml_instrumentConfigurationRef' from 'origin'.
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Requirement already satisfied: spectrum-fundamentals<0.5.0,>=0.4.3 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (0.4.3)
Requirement already satisfied: click>=8.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (8.1.7)
Requirement already satisfied: lxml<5.0.0,>=4.5.2 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (4.9.3)
Requirement already satisfied: rich>=10.3.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (13.5.3)
Requirement already satisfied: numpy<2.0.0,>=1.18.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (1.24.4)
Requirement already satisfied: PyYAML>=5.4.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (6.0.1)
Requirement already satisfied: pyteomics<5.0.0,>=4.3.3 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (4.6.2)
Requirement already satisfied: tables<4.0.0,>=3.6.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (3.8.0)
Requirement already satisfied: pandas<2.0.0,>=1.3.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (1.5.3)
Requirement already satisfied: pymzml<3.0.0,>=2.5.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (2.5.2)
Requirement already satisfied: h5py<4.0.0,>=3.1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum_io==0.3.3) (3.9.0)
Requirement already satisfied: python-dateutil>=2.8.1 in ./oktoberfest-env/lib/python3.9/site-packages (from pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in ./oktoberfest-env/lib/python3.9/site-packages (from pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (2023.3.post1)
Requirement already satisfied: regex in ./oktoberfest-env/lib/python3.9/site-packages (from pymzml<3.0.0,>=2.5.0->spectrum_io==0.3.3) (2023.8.8)
Requirement already satisfied: six>=1.5 in ./oktoberfest-env/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas<2.0.0,>=1.3.0->spectrum_io==0.3.3) (1.16.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./oktoberfest-env/lib/python3.9/site-packages (from rich>=10.3.0->spectrum_io==0.3.3) (2.16.1)
Requirement already satisfied: markdown-it-py>=2.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from rich>=10.3.0->spectrum_io==0.3.3) (3.0.0)
Requirement already satisfied: mdurl~=0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from markdown-it-py>=2.2.0->rich>=10.3.0->spectrum_io==0.3.3) (0.1.2)
Requirement already satisfied: joblib<2.0.0,>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.3.2)
Requirement already satisfied: scikit-learn<2.0,>=1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.3.1)
Requirement already satisfied: moepy<2.0.0,>=1.1.4 in ./oktoberfest-env/lib/python3.9/site-packages (from spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.1.4)
Requirement already satisfied: tqdm>=4.59.0 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (4.66.1)
Requirement already satisfied: matplotlib>=3.3.3 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.8.0)
Requirement already satisfied: scipy>=1.6.0 in ./oktoberfest-env/lib/python3.9/site-packages (from moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.11.2)
Requirement already satisfied: cycler>=0.10 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (0.11.0)
Requirement already satisfied: packaging>=20.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (23.1)
Requirement already satisfied: contourpy>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.1.1)
Requirement already satisfied: importlib-resources>=3.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (6.1.0)
Requirement already satisfied: fonttools>=4.22.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (4.42.1)
Requirement already satisfied: pyparsing>=2.3.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.1.1)
Requirement already satisfied: kiwisolver>=1.0.1 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (1.4.5)
Requirement already satisfied: pillow>=6.2.0 in ./oktoberfest-env/lib/python3.9/site-packages (from matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (10.0.1)
Requirement already satisfied: zipp>=3.1.0 in ./oktoberfest-env/lib/python3.9/site-packages (from importlib-resources>=3.2.0->matplotlib>=3.3.3->moepy<2.0.0,>=1.1.4->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.17.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from scikit-learn<2.0,>=1.0->spectrum-fundamentals<0.5.0,>=0.4.3->spectrum_io==0.3.3) (3.2.0)
Requirement already satisfied: py-cpuinfo in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (9.0.0)
Requirement already satisfied: cython>=0.29.21 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (3.0.2)
Requirement already satisfied: numexpr>=2.6.2 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (2.8.6)
Requirement already satisfied: blosc2~=2.0.0 in ./oktoberfest-env/lib/python3.9/site-packages (from tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (2.0.0)
Requirement already satisfied: msgpack in ./oktoberfest-env/lib/python3.9/site-packages (from blosc2~=2.0.0->tables<4.0.0,>=3.6.1->spectrum_io==0.3.3) (1.0.6)
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-05 16:15:39,182 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-05 16:15:39,183 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-05 16:15:39,184 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-05 16:15:39,184 - INFO - oktoberfest.runner::run_ce_calibration Found 1 files in the spectra directory.
2023-10-05 16:15:39,185 - INFO - oktoberfest.utils.process_step::is_done Skipping preprocessing_search step because /scratch/tobiasko/proc/preprocessing_search.done was found.
Waiting for tasks to complete: 0%| | 0/1 [00:00<?, ?it/s]2023-10-05 16:15:39,202 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/dda/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML
Waiting for tasks to complete: 0%| | 0/1 [00:03<?, ?it/s]
2023-10-05 16:15:42,388 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-05 16:15:42,388 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-05 16:15:42,389 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 154, in read_mzml
instrument_configuration_ref = spec["scanList"]["scan"][0]["instrumentConfigurationRef"]
KeyError: 'instrumentConfigurationRef'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'instrumentConfigurationRef'
2023-10-05 16:15:42,389 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
2023-10-05 16:15:42,389 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool 'instrumentConfigurationRef'
Not sure why... I checked your commit 0ed85b8 in spectrum_io, and on my local system it is still the old code (I looked at line 154 in msraw.py).
Gotta love pip... OK, try pip uninstall spectrum-io, then repeat the git install from this branch. I have had trouble with this before and I don't know why that is, but it is an issue with pip, because the change is definitely there in this branch.
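To confirm which copy of the package Python actually picks up after the reinstall, something like the following could help. It is only a sketch using the standard library; the helper names `describe_install` and `read_source_line` are made up for illustration and are not part of spectrum_io or Oktoberfest.

```python
import importlib.util
from importlib import metadata
from pathlib import Path


def _find_origin(module_name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


def describe_install(dist_name, module_name):
    """Report the installed distribution version and the module's file location."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"version": version, "origin": _find_origin(module_name)}


def read_source_line(module_name, lineno):
    """Return one (1-based) line of the installed module's source, or None."""
    origin = _find_origin(module_name)
    if origin is None:
        return None
    lines = Path(origin).read_text(encoding="utf-8").splitlines()
    return lines[lineno - 1] if 1 <= lineno <= len(lines) else None


if __name__ == "__main__":
    # Distribution and module names as they appear in the logs above.
    print(describe_install("spectrum_io", "spectrum_io.raw.msraw"))
    print(read_source_line("spectrum_io.raw.msraw", 154))
```

If the printed line 154 still shows the old lookup, the environment is still importing the previously installed copy.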
BAAM! Looks like it works.
And the answer to life, the universe, and everything is
cat /scratch/tobiasko/results/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01_ce.txt
31
and not 42! 😂
Very cool. Thanks a lot for your fast help. Now I can do this for all the files. Does multithreading help for this workflow?
Yes, it does. Parallel processing is realised at the file level: the shared msms.prosit is split by raw file, and the entire annotation and prediction is then performed in parallel. I.e., use as many processes as you have files via the numThreads option in the config.
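As a rough illustration of what file-level parallelism means here, the sketch below groups PSM records by raw file and hands each group to a worker. It is not Oktoberfest's implementation: the tracebacks in this thread suggest Oktoberfest uses a process pool (`JobPool(processes=config.num_threads)`), while this sketch uses a thread pool to stay short. The `raw_file` key and the `process_file` placeholder are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_raw_file(psms):
    """Group PSM records by the raw file they belong to."""
    groups = defaultdict(list)
    for psm in psms:
        groups[psm["raw_file"]].append(psm)  # "raw_file" is a hypothetical key
    return dict(groups)


def process_file(item):
    """Placeholder for the per-file work (annotation, prediction, calibration)."""
    raw_file, rows = item
    return raw_file, len(rows)


def run_per_file(psms, num_threads):
    """Process each raw file's PSMs independently, one task per file."""
    groups = split_by_raw_file(psms)
    # More workers than files brings no benefit, since a file is the unit of work.
    workers = max(1, min(num_threads, len(groups)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_file, groups.items()))


if __name__ == "__main__":
    demo = [
        {"raw_file": "Sample_Alpha_01", "sequence": "PEPTIDEK"},
        {"raw_file": "Sample_Alpha_01", "sequence": "ELVISK"},
        {"raw_file": "Sample_Alpha_02", "sequence": "SAMPLER"},
    ]
    print(run_per_file(demo, num_threads=4))
```

Because the unit of work is a whole file, setting the number of workers above the number of files does not speed anything up.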
Very cool, 45 files done!
nl /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/results/*.txt
1 31
2 31
3 30
4 31
5 31
6 31
7 31
8 30
9 31
10 31
11 31
12 30
13 31
14 31
15 31
16 31
17 31
18 31
19 31
20 31
21 31
22 31
23 31
24 31
25 31
26 31
27 31
28 30
29 31
30 31
31 31
32 31
33 31
34 31
35 31
36 31
37 31
38 30
39 31
40 31
41 31
42 31
43 30
44 30
45 30
Sorry... the problems continue! The output below is from a CE calibration of FragPipe results. This time the mzML was written by FragPipe (not MSConvert) and the raw data is ddaPASEF style (so .tdf or .d):
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-09 10:49:07,230 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-09 10:49:07,230 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-09 10:49:07,231 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-09 10:49:07,232 - INFO - oktoberfest.runner::run_ce_calibration Found 36 files in the spectra directory.
2023-10-09 10:49:07,232 - INFO - oktoberfest.runner::_preprocess Converting search results from /scratch/cpanse/PXD028735/ddaPASEF/FragPipeOutput/20230714_0922 to internal search result.
78%|███████████████████████████████████████████████████████████████████████████████████████████▊ | 28/36 [37:15<09:15, 69.49s/it]100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 36/36 [55:31<00:00, 92.55s/it]
2023-10-09 11:46:33,629 - INFO - spectrum_io.search_result.search_results::filter_valid_prosit_sequences #sequences before filtering for valid prosit sequences: 8785672
2023-10-09 11:47:03,017 - INFO - spectrum_io.search_result.search_results::filter_valid_prosit_sequences #sequences after filtering for valid prosit sequences: 8229990
2023-10-09 11:49:47,459 - INFO - oktoberfest.runner::_preprocess Read 8229990 PSMs from /scratch/cpanse/PXD028735/ddaPASEF/FragPipeOutput/20230714_0922/msms/msms.prosit
2023-10-09 11:50:05,883 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/ddaPASEF/FragPipeOutput/20230714_0922/msms/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01_uncalibrated.rescore
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main │
│ │
│ 194 │ main_globals = sys.modules["__main__"].__dict__ │
│ 195 │ if alter_argv: │
│ 196 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 197 │ return _run_code(code, main_globals, None, │
│ 198 │ │ │ │ │ "__main__", mod_spec) │
│ 199 │
│ 200 def run_module(mod_name, init_globals=None, │
│ │
│ /usr/lib/python3.9/runpy.py:87 in _run_code │
│ │
│ 84 │ │ │ │ │ __loader__ = loader, │
│ 85 │ │ │ │ │ __package__ = pkg_name, │
│ 86 │ │ │ │ │ __spec__ = mod_spec) │
│ ❱ 87 │ exec(code, run_globals) │
│ 88 │ return run_globals │
│ 89 │
│ 90 def _run_module_code(code, init_globals=None, │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:39 in │
│ <module> │
│ │
│ 36 │
│ 37 if __name__ == "__main__": │
│ 38 │ traceback.install() │
│ ❱ 39 │ main() # pragma: no cover │
│ 40 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:34 in main │
│ │
│ 31 │ logger.info(f"Oktoberfest version {__version__}\n{__copyright__}") │
│ 32 │ │
│ 33 │ args = _parse_args() │
│ ❱ 34 │ runner.run_job(args.config_path) │
│ 35 │
│ 36 │
│ 37 if __name__ == "__main__": │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:366 in run_job │
│ │
│ 363 │ if job_type == "SpectralLibraryGeneration": │
│ 364 │ │ generate_spectral_lib(config_path) │
│ 365 │ elif job_type == "CollisionEnergyCalibration": │
│ ❱ 366 │ │ run_ce_calibration(config_path) │
│ 367 │ elif job_type == "Rescoring": │
│ 368 │ │ run_rescoring(config_path) │
│ 369 │ else: │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:229 in │
│ run_ce_calibration │
│ │
│ 226 │ proc_dir = config.output / "proc" │
│ 227 │ proc_dir.mkdir(parents=True, exist_ok=True) │
│ 228 │ │
│ ❱ 229 │ _preprocess(spectra_files, config) │
│ 230 │ │
│ 231 │ processing_pool = JobPool(processes=config.num_threads) │
│ 232 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:45 in │
│ _preprocess │
│ │
│ 42 │ │ search_results = pp.filter_peptides_for_model(peptides=search_results, model=con │
│ 43 │ │ │
│ 44 │ │ # split search results │
│ ❱ 45 │ │ pp.split_search( │
│ 46 │ │ │ search_results=search_results, │
│ 47 │ │ │ output_dir=config.output / "msms", │
│ 48 │ │ │ filenames=[spectra_file.stem for spectra_file in spectra_files], │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:314 in split_search │
│ │
│ 311 │ for filename in filenames: │
│ 312 │ │ output_file = (output_dir / filename).with_suffix(".rescore") │
│ 313 │ │ logger.info(f"Creating split msms.txt file {output_file}") │
│ ❱ 314 │ │ grouped_search_results.get_group(filename).to_csv(output_file) │
│ 315 │
│ 316 │
│ 317 def merge_spectra_and_peptides(spectra: pd.DataFrame, search: pd.DataFrame) -> Spectra: │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pandas/core/groupby/groupby.py:817 in │
│ get_group │
│ │
│ 814 │ │ │
│ 815 │ │ inds = self._get_index(name) │
│ 816 │ │ if not len(inds): │
│ ❱ 817 │ │ │ raise KeyError(name) │
│ 818 │ │ │
│ 819 │ │ return obj._take_with_is_copy(inds, axis=self.axis) │
│ 820 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01_uncalibrated'
I am not sure whether Oktoberfest expects the .raw files and .mzML files to have exactly the same base file name (everything except the final extension). FragPipe adds this _uncalibrated tag:
ls -lh /scratch/cpanse/PXD028735/ddaPASEF/
total 150G
-rw-rw-r--+ 1 tobiasko SG_Employees 2.9K Jul 11 10:08 ddaPASEF.fp-manifest
-rw-rw-r--+ 1 tobiasko SG_Employees 11K Jul 14 09:21 Default_zero_Oktoberfest.workflow
-rw-rw-r--+ 1 tobiasko SG_Employees 11K Jul 12 11:12 Default_zero.workflow
drwxrwxr-x+ 1 tobiasko SG_Employees 36 Jul 18 09:55 FragPipeOutput
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:42 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 09:41 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 09:42 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:50 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 10:14 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 10:15 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:51 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 10:51 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 10:52 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:51 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 11:47 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 11:48 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:52 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 12:04 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 12:04 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 16:08 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 12:19 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 12:20 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:53 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 12:39 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 12:40 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:54 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 12:58 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 12:58 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 16:10 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 13:31 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 13:33 LFQ_timsTOFPro_PASEF_Condition_A_Sample_Gamma_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:54 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 15:05 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 15:08 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:55 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 16:12 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 16:13 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:55 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 16:37 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 16:38 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Alpha_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 16:13 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 16:54 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 16:55 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:57 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 17:08 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 17:08 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:57 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 17:22 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 17:22 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Beta_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:58 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 17:36 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 17:37 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:58 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 17:49 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 17:49 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:59 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 18:01 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 18:02 LFQ_timsTOFPro_PASEF_Condition_B_Sample_Gamma_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 09:59 LFQ_timsTOFPro_PASEF_Ecoli_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 874M Jul 14 18:11 LFQ_timsTOFPro_PASEF_Ecoli_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.0G Jul 14 18:11 LFQ_timsTOFPro_PASEF_Ecoli_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:00 LFQ_timsTOFPro_PASEF_Ecoli_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 909M Jul 14 18:20 LFQ_timsTOFPro_PASEF_Ecoli_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.1G Jul 14 18:21 LFQ_timsTOFPro_PASEF_Ecoli_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:00 LFQ_timsTOFPro_PASEF_Ecoli_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 901M Jul 14 18:30 LFQ_timsTOFPro_PASEF_Ecoli_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.1G Jul 14 18:30 LFQ_timsTOFPro_PASEF_Ecoli_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:00 LFQ_timsTOFPro_PASEF_Human_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 18:43 LFQ_timsTOFPro_PASEF_Human_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 18:43 LFQ_timsTOFPro_PASEF_Human_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:01 LFQ_timsTOFPro_PASEF_Human_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 18:56 LFQ_timsTOFPro_PASEF_Human_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 18:57 LFQ_timsTOFPro_PASEF_Human_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 16:15 LFQ_timsTOFPro_PASEF_Human_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 19:10 LFQ_timsTOFPro_PASEF_Human_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 19:11 LFQ_timsTOFPro_PASEF_Human_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:02 LFQ_timsTOFPro_PASEF_QC_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 19:24 LFQ_timsTOFPro_PASEF_QC_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 19:25 LFQ_timsTOFPro_PASEF_QC_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:02 LFQ_timsTOFPro_PASEF_QC_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 19:38 LFQ_timsTOFPro_PASEF_QC_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 19:39 LFQ_timsTOFPro_PASEF_QC_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:03 LFQ_timsTOFPro_PASEF_QC_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 19:52 LFQ_timsTOFPro_PASEF_QC_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 19:53 LFQ_timsTOFPro_PASEF_QC_03_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:03 LFQ_timsTOFPro_PASEF_QC_04.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 20:06 LFQ_timsTOFPro_PASEF_QC_04.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.1G Jul 14 20:07 LFQ_timsTOFPro_PASEF_QC_04_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:04 LFQ_timsTOFPro_PASEF_QC_05.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 20:20 LFQ_timsTOFPro_PASEF_QC_05.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 20:20 LFQ_timsTOFPro_PASEF_QC_05_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:05 LFQ_timsTOFPro_PASEF_QC_06.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 20:33 LFQ_timsTOFPro_PASEF_QC_06.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 20:33 LFQ_timsTOFPro_PASEF_QC_06_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:05 LFQ_timsTOFPro_PASEF_QC_07.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.5G Jul 14 20:46 LFQ_timsTOFPro_PASEF_QC_07.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 3.0G Jul 14 20:46 LFQ_timsTOFPro_PASEF_QC_07_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 16:16 LFQ_timsTOFPro_PASEF_QC_08.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.3G Jul 14 20:59 LFQ_timsTOFPro_PASEF_QC_08.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.9G Jul 14 20:59 LFQ_timsTOFPro_PASEF_QC_08_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:06 LFQ_timsTOFPro_PASEF_QC_09.d
-rw-rw-r--+ 1 tobiasko SG_Employees 200M Jul 14 21:04 LFQ_timsTOFPro_PASEF_QC_09.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 692M Jul 14 21:04 LFQ_timsTOFPro_PASEF_QC_09_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:06 LFQ_timsTOFPro_PASEF_Yeast_01.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.3G Jul 14 21:16 LFQ_timsTOFPro_PASEF_Yeast_01.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.9G Jul 14 21:17 LFQ_timsTOFPro_PASEF_Yeast_01_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:07 LFQ_timsTOFPro_PASEF_Yeast_02.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.3G Jul 14 21:29 LFQ_timsTOFPro_PASEF_Yeast_02.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.9G Jul 14 21:30 LFQ_timsTOFPro_PASEF_Yeast_02_uncalibrated.mzML
drwxrwxr-x+ 1 tobiasko SG_Employees 56 Jul 7 10:07 LFQ_timsTOFPro_PASEF_Yeast_03.d
-rw-rw-r--+ 1 tobiasko SG_Employees 1.3G Jul 14 21:43 LFQ_timsTOFPro_PASEF_Yeast_03.mzBIN
-rw-rw-r--+ 1 tobiasko SG_Employees 2.8G Jul 14 21:44 LFQ_timsTOFPro_PASEF_Yeast_03_uncalibrated.mzML
-rwxrw-r--+ 1 tobiasko SG_Employees 233 Jul 14 09:26 runfragpipe.bash
The filename without the extension has to match. I don't know why they add _uncalibrated, but then they should also add this suffix to the search results. We cannot possibly know how to match arbitrary filename manipulations, and I suggest that FragPipe fixes this on their side. For now, you would need to correct the filenames. I could maybe add a check for "_uncalibrated", as long as this suffix is always added. But it gets difficult if every tool changes the filename somehow.
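A check of the kind mentioned above could look roughly like this. It is a sketch only, not code from Oktoberfest: the function names are hypothetical, and the list of known suffixes contains just the one seen in this thread.

```python
from pathlib import Path

# Assumption: only this suffix is known; other tools may add different ones.
KNOWN_SUFFIXES = ("_uncalibrated",)


def normalized_stem(path, suffixes=KNOWN_SUFFIXES):
    """Return the file stem with a known tool-specific suffix removed."""
    stem = Path(path).stem
    for suffix in suffixes:
        if stem.endswith(suffix):
            return stem[: -len(suffix)]
    return stem


def unmatched_spectra(spectra_files, search_raw_files):
    """List spectra files whose normalized stem is absent from the search results."""
    known = set(search_raw_files)
    return [str(p) for p in spectra_files if normalized_stem(p) not in known]


if __name__ == "__main__":
    spectra = ["LFQ_timsTOFPro_PASEF_Yeast_02_uncalibrated.mzML"]
    raw_names = ["LFQ_timsTOFPro_PASEF_Yeast_02"]
    print(unmatched_spectra(spectra, raw_names))
```

Running such a check before splitting the search results would turn the KeyError above into an explicit list of files that cannot be matched.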
Jip! I totally agree, not their best idea... could I also use a hard link for this purpose?
Yes, Oktoberfest supports links. I would rather use a symlink though, e.g. ln -s LFQ_timsTOFPro_PASEF_Yeast_02_uncalibrated.mzML LFQ_timsTOFPro_PASEF_Yeast_02.mzML
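With 36 files, creating the links by hand gets tedious. The sketch below creates one symlink per `*_uncalibrated.mzML` file in a separate directory, with the suffix dropped from the link name. The function name and its defaults are assumptions for illustration; only `pathlib` from the standard library is used.

```python
from pathlib import Path


def link_without_suffix(src_dir, link_dir, suffix="_uncalibrated", ext=".mzML"):
    """Create symlinks in link_dir that point at src_dir files, minus the suffix."""
    src_dir, link_dir = Path(src_dir), Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sorted(src_dir.glob(f"*{suffix}{ext}")):
        # e.g. X_uncalibrated.mzML -> X.mzML
        link = link_dir / (src.name[: -len(suffix + ext)] + ext)
        if not link.exists() and not link.is_symlink():
            link.symlink_to(src.resolve())
        links.append(link)
    return links


if __name__ == "__main__":
    # Hypothetical locations; adjust to the actual spectra directory.
    for created in link_without_suffix("spectra", "spectra/links"):
        print(created)
```

The spectra directory in the config would then need to point at the link directory rather than at the original files.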
Hmmm, I think spectrum_io has some type of problem with the symlinks:
python3 -m oktoberfest --config_path ~/CEcalibration_config.json
2023-10-09 15:36:57,466 - INFO - oktoberfest::main Oktoberfest version 0.5.0
Copyright 2023, Wilhelmlab at Technical University of Munich
2023-10-09 15:36:57,468 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-09 15:36:57,469 - INFO - oktoberfest.utils.config::read Reading configuration from /home/tobiasko/CEcalibration_config.json
2023-10-09 15:36:57,470 - INFO - oktoberfest.runner::run_ce_calibration Found 36 files in the spectra directory.
2023-10-09 15:36:57,470 - INFO - oktoberfest.utils.process_step::is_done Skipping preprocessing_search step because /scratch/cpanse/PXD028735/ddaPASEF/FragPipeOutput/20230714_0922/proc/preprocessing_search.done was found.
Waiting for tasks to complete: 0%| | 0/36 [00:00<?, ?it/s]2023-10-09 15:36:57,871 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_01.mzML
2023-10-09 15:36:57,873 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_02.mzML
2023-10-09 15:36:57,877 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Alpha_03.mzML
Waiting for tasks to complete: 0%| | 0/36 [00:00<?, ?it/s]
2023-10-09 15:36:57,881 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-09 15:36:57,881 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool Caught Unknown exception, terminating workers
2023-10-09 15:36:57,881 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_01.mzML
2023-10-09 15:36:57,882 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_02.mzML
2023-10-09 15:36:57,882 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 155, in read_mzml
fragmentation = spec["scanList"]["scan"][0]["filter string"].split("@")[1][:3].upper()
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
IndexError: list index out of range
2023-10-09 15:36:57,882 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 199, in _ce_calib
library = _annotate_and_get_library(spectra_file, config)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py", line 67, in _annotate_and_get_library
spectra = pp.load_spectra(mzml_file)
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py", line 372, in load_spectra
return ThermoRaw.read_mzml(
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/spectrum_io/raw/msraw.py", line 155, in read_mzml
fragmentation = spec["scanList"]["scan"][0]["filter string"].split("@")[1][:3].upper()
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/utils/multiprocessing_pool.py", line 43, in check_pool
outputs.append(res.get(timeout=10000)) # 10000 seconds = ~3 hours
File "/usr/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
IndexError: list index out of range
2023-10-09 15:36:57,882 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool list index out of range
2023-10-09 15:36:57,882 - ERROR - oktoberfest.utils.multiprocessing_pool::check_pool list index out of range
2023-10-09 15:36:57,885 - INFO - spectrum_io.raw.msraw::read_mzml Reading mzML file: /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03.mzML
but
head /scratch/cpanse/PXD028735/ddaPASEF/links/LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03.mzML
<?xml version='1.0' encoding='UTF-8'?>
<indexedmzML xmlns="http://psi.hupo.org/ms/mzml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.2_idx.xsd">
<mzML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="LFQ_timsTOFPro_PASEF_Condition_A_Sample_Beta_03.d" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.2_idx.xsd">
<cvList count="2">
<cv id="MS" fullName="Proteomics Standards Initiative Mass Spectrometry Ontology" version="4.1.103" URI="https://raw.githubusercontent.com/HUPO-PSI/psi-ms-CV/master/psi-ms.obo"/>
<cv id="UO" fullName="Unit Ontology" version="09:04:2014" URI="https://raw.githubusercontent.com/bio-ontology-research-group/unit-ontology/master/unit.obo"/>
</cvList>
<fileDescription>
<fileContent>
<cvParam cvRef="MS" accession="MS:1000579" name="MS1 spectrum" value=""/>
This is not because of symlinks but because the "filter string" accession is not present. spectrum-io uses it to determine the fragmentation type (HCD and CID are supported at the moment) as well as the m/z range of the spectrum. The problem is that this accession seems to be unique to Thermo instruments, i.e. it is not always there, but we need it to annotate fragment peaks.
@WassimG do you have an idea how to do this better? It seems HUPO PSI made the terms HCD/CID obsolete and suggests a different accession, which to me doesn't even make sense: https://raw.githubusercontent.com/HUPO-PSI/psi-ms-CV/master/psi-ms.obo (search for HCD). We need to get the information in a different way if it isn't Thermo data.
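To illustrate why the line in the traceback is fragile, here is a minimal sketch of a more defensive lookup. It assumes the spectrum is a plain nested dict shaped like the one indexed in msraw.py; the function name is hypothetical and this is not the fix that went into spectrum-io. It returns None instead of raising when the scan list is empty, the "filter string" entry is missing, or the string contains no "@".

```python
from typing import Optional


def fragmentation_from_filter_string(spec: dict) -> Optional[str]:
    """Return e.g. 'HCD' or 'CID' from a Thermo-style filter string, or None if unavailable.

    Hypothetical helper; `spec` is assumed to be a nested dict for one spectrum.
    """
    scans = spec.get("scanList", {}).get("scan", [])
    if not scans:
        return None  # no scan entry at all
    filter_string = scans[0].get("filter string")
    if not filter_string or "@" not in filter_string:
        return None  # accession missing, or not in the Thermo-style format
    # Same slicing as the line in the traceback: three characters after the '@'.
    return filter_string.split("@")[1][:3].upper()
```

A caller can then decide what to do with None, for example fall back to other metadata or skip the check entirely.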
ok. I loooooove mzML! So it is this (searched for HCD):
[Term]
id: MS:1000422
name: beam-type collision-induced dissociation
def: "A collision-induced dissociation process that occurs in a beam-type collision cell." [PSI:MS]
synonym: "HCD" EXACT []
is_a: MS:1000133 ! collision-induced dissociation
But why obsolete?
I don't know, this is just something they say in the mzML documentation. I will have to check the mzML to see which accessions are used to define the scan window and fragmentation type and will come back to you once I know how to solve this.
But why are you so keen on checking that the scans are actually of the HCD/CID fragmentation type? I kind of understood this when the code was still sitting behind Prosit - which only had HCD/CID models trained on Orbitrap data, but now that all kinds of models could become available through Koina... or maybe someone wants to score EtHCD data vs. a model trained on CID data? So in essence, is this check really needed? Maybe, just check if the scan is a fragment ion scan (guess that is the ms level in mzML) and place a warning if the fragmentation method indicated in the scan metadata mismatches the selected model, but even that matching might be difficult to do.
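The suggestion above (only require a fragment ion scan, and warn rather than fail on a mismatch) could look roughly like the following sketch. The function name and the substring match against the model name are assumptions made for illustration; as noted, a real mapping between scan metadata and models is likely harder than this.

```python
import warnings
from typing import Optional


def should_annotate(ms_level: int, fragmentation: Optional[str], model_name: str) -> bool:
    """Decide whether a scan is used, warning (not failing) on a possible model mismatch.

    Hypothetical helper; the substring comparison is a naive placeholder.
    """
    if ms_level < 2:
        return False  # not a fragment ion scan
    if fragmentation is None:
        warnings.warn("Fragmentation type not found in scan metadata; using scan anyway.")
    elif fragmentation.upper() not in model_name.upper():
        warnings.warn(
            f"Scan fragmentation {fragmentation!r} may not match model {model_name!r}."
        )
    return True
```

With this design the user stays in control: a CID scan scored against an HCD model still goes through, but the log shows that the combination was unusual.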
I sort of agree. The issue is now more about whether we should care about the fragmentation type and FTMS/ITMS/TOF at all, or instead leave this to the user. We really only need the scan window, and the user can provide the desired mass tolerance used for the search, which is already supported in the config. We realised that mzML converted using MSConvert is actually not working at all. We are still looking into this in the hope of finding a solution.
ok. crazy! Not at all? But the mzML files written by FragPipe work?
Yes, because you were able to read the information from the filter string attribute, which is not there all the time. What should be there is the scanWindowList and the activation attribute within the precursorList.
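As a rough sketch of reading those two pieces of information instead of the filter string: the code below assumes the spectrum has already been parsed into a nested dict with the key layout shown (this mirrors what I would expect from a pyteomics-style parser, but treat the exact keys as an assumption), and the function name and the term-to-abbreviation mapping are made up for illustration.

```python
from typing import Optional, Tuple

# Assumed mapping from PSI-MS activation terms to the short names used in this thread.
ACTIVATION_NAMES = {
    "beam-type collision-induced dissociation": "HCD",
    "collision-induced dissociation": "CID",
}


def scan_window_and_activation(spec: dict) -> Tuple[Tuple[float, float], Optional[str]]:
    """Return ((lower m/z, upper m/z), fragmentation) for one parsed spectrum dict."""
    window = spec["scanList"]["scan"][0]["scanWindowList"]["scanWindow"][0]
    mz_range = (
        float(window["scan window lower limit"]),
        float(window["scan window upper limit"]),
    )
    activation = spec["precursorList"]["precursor"][0]["activation"]
    # Pick the first known activation term present; None if the term is not in the mapping.
    fragmentation = next(
        (short for term, short in ACTIVATION_NAMES.items() if term in activation),
        None,
    )
    return mz_range, fragmentation
```

An unknown activation term yields None rather than an error, which leaves the decision about unsupported fragmentation types to the caller.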
I pushed to https://github.com/wilhelm-lab/spectrum_io/pull/75/commits/4a60a9ce8c31b5ecacc5a61b5c386765eff58d62 which should fix your problem. I removed the dependency on the filter string attribute. You should pip uninstall spectrum-io, then pip install git+https://github.com/wilhelm-lab/spectrum_io.git@fix/mzml_instrumentConfigurationRef
I added some unit tests and they work but please check this before I merge it.
Hi there,
not sure if this is a new problem, or still the old one. I started a CE calibration:
cat CollisionEnergyCalibration_2024_01_03-16_07_20.log
2024-01-03 16:07:20,052 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.5.2
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-01-03 16:07:20,052 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-01-03 16:07:20,052 - INFO - oktoberfest.runner::run_job {
"type": "CollisionEnergyCalibration",
"tag": "",
"output": "/scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/",
"inputs": {
"search_results_type": "Msfragger",
"spectra": "/scratch/cpanse/PXD028735/dda/",
"spectra_type": "mzml",
"search_results": "/scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/"
},
"models": {
"intensity": "Prosit_2020_intensity_HCD",
"irt": "Prosit_2019_irt"
},
"prediction_server": "koina.proteomicsdb.org:443",
"numThreads": 1,
"regressionMethod": "spline",
"ssl": true,
"thermoExe": "ThermoRawFileParser.exe",
"massTolerance": 20,
"unitMassTolerance": "ppm",
"ce_alignment_options": {
"ce_range": [
19,
50
],
"use_ransac_model": false
}
}
2024-01-03 16:07:20,052 - INFO - oktoberfest.utils.config::read Reading configuration from CEcalibration_Prosit_2020_intensity_HCD.json
2024-01-03 16:07:20,053 - INFO - oktoberfest.runner::run_ce_calibration Found 45 files in the spectra directory.
2024-01-03 16:07:20,053 - INFO - oktoberfest.runner::_preprocess Converting search results from /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912 to internal search result.
2024-01-03 16:34:45,737 - INFO - spectrum_io.search_result.search_results::filter_valid_prosit_sequences #sequences before filtering for valid prosit sequences: 4599403
2024-01-03 16:34:52,716 - INFO - spectrum_io.search_result.search_results::filter_valid_prosit_sequences #sequences after filtering for valid prosit sequences: 4479334
2024-01-03 16:35:26,613 - INFO - oktoberfest.runner::_preprocess Read 4479334 PSMs from /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/msms.prosit
2024-01-03 16:35:31,779 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.rescore
2024-01-03 16:35:32,492 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_04.rescore
2024-01-03 16:35:32,913 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_04.rescore
2024-01-03 16:35:33,336 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_02.rescore
2024-01-03 16:35:33,762 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_02.rescore
2024-01-03 16:35:34,196 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_01.rescore
2024-01-03 16:35:34,596 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_01.rescore
2024-01-03 16:35:35,007 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.rescore
2024-01-03 16:35:35,404 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_02.rescore
2024-01-03 16:35:35,817 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_03.rescore
2024-01-03 16:35:36,227 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_03.rescore
2024-01-03 16:35:36,660 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_04.rescore
2024-01-03 16:35:37,089 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Gamma_03.rescore
2024-01-03 16:35:37,517 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_A_Sample_Beta_01.rescore
2024-01-03 16:35:37,940 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_03.rescore
2024-01-03 16:35:38,369 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Alpha_04.rescore
2024-01-03 16:35:38,790 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_01.rescore
2024-01-03 16:35:39,286 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_02.rescore
2024-01-03 16:35:39,713 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_03.rescore
2024-01-03 16:35:40,127 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Beta_04.rescore
2024-01-03 16:35:40,546 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_01.rescore
2024-01-03 16:35:40,953 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_02.rescore
2024-01-03 16:35:41,388 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_04.rescore
2024-01-03 16:35:41,806 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Condition_B_Sample_Gamma_03.rescore
2024-01-03 16:35:42,229 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Ecoli_02.rescore
2024-01-03 16:35:42,374 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Ecoli_01.rescore
2024-01-03 16:35:42,540 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Ecoli_03.rescore
2024-01-03 16:35:42,696 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Human_01.rescore
2024-01-03 16:35:43,100 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Human_02.rescore
2024-01-03 16:35:43,491 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Human_03.rescore
2024-01-03 16:35:43,880 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_01.rescore
2024-01-03 16:35:44,251 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_02.rescore
2024-01-03 16:35:44,640 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_03.rescore
2024-01-03 16:35:45,037 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_04.rescore
2024-01-03 16:35:45,422 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_05.rescore
2024-01-03 16:35:45,817 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_06.rescore
2024-01-03 16:35:46,217 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_07.rescore
2024-01-03 16:35:46,620 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_08.rescore
2024-01-03 16:35:47,024 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_09.rescore
2024-01-03 16:35:47,417 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_10.rescore
2024-01-03 16:35:47,815 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_11.rescore
2024-01-03 16:35:48,212 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_QC_12.rescore
2024-01-03 16:35:48,604 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Yeast_01.rescore
2024-01-03 16:35:48,947 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Yeast_02.rescore
2024-01-03 16:35:49,280 - INFO - oktoberfest.preprocessing.preprocessing::split_search Creating split msms.txt file /scratch/cpanse/PXD028735/dda/FragPipeOutput/20230714_0912/out/msms/LFQ_Orbitrap_DDA_Yeast_03.rescore
2024-01-03 16:35:49,774 - INFO - spectrum_io.raw.msraw::_read_mzml_pyteomics Reading mzML file: /scratch/cpanse/PXD028735/dda/LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_02.mzML
and it fails when Oktoberfest starts reading from the first mzML files:
│ ❱ 84 │ │ │ return func(self, *args, **kwargs) │
│ 85 │ │ finally: │
│ 86 │ │ │ self.seek(position) │
│ 87 │ return wrapped │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pyteomics/xml.py:1150 in get_by_id │
│ │
│ 1147 │ │ │ │ id_key = self._indexed_tag_keys.get(element_type) │
│ 1148 │ │ │ elem = self._find_by_id_no_reset(elem_id, id_key=id_key) │
│ 1149 │ │ except (KeyError, AttributeError, etree.LxmlError): │
│ ❱ 1150 │ │ │ elem = self._find_by_id_reset(elem_id, id_key=id_key) │
│ 1151 │ │ data = self._get_info_smart(elem, **kwargs) │
│ 1152 │ │ return data │
│ 1153 │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pyteomics/auxiliary/file_helpers.py:8 │
│ 4 in wrapped │
│ │
│ 81 │ │ position = self.tell() │
│ 82 │ │ self.seek(0) │
│ 83 │ │ try: │
│ ❱ 84 │ │ │ return func(self, *args, **kwargs) │
│ 85 │ │ finally: │
│ 86 │ │ │ self.seek(position) │
│ 87 │ return wrapped │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pyteomics/xml.py:1117 in │
│ _find_by_id_reset │
│ │
│ 1114 │ │
│ 1115 │ @_keepstate │
│ 1116 │ def _find_by_id_reset(self, elem_id, id_key=None): │
│ ❱ 1117 │ │ return self._find_by_id_no_reset(elem_id, id_key=id_key) │
│ 1118 │ │
│ 1119 │ @_keepstate │
│ 1120 │ def get_by_id(self, elem_id, id_key=None, element_type=None, **kwargs): │
│ │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/pyteomics/xml.py:661 in │
│ _find_by_id_no_reset │
│ │
│ 658 │ │ │ │ │ return elem │
│ 659 │ │ │ │ if not found: │
│ 660 │ │ │ │ │ elem.clear() │
│ ❱ 661 │ │ raise KeyError(elem_id) │
│ 662 │ │
│ 663 │ @_keepstate │
│ 664 │ def get_by_id(self, elem_id, **kwargs): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
KeyError: 'commonInstrumentParams'
BUT, this has worked in a previous attempt.
It's a new one. In order to support different models in Koina which require the instrument type to be read from the mzML file, we introduced a new column in the internal format that contains this information. This is already implemented in the latest version and works with the mzML files we tested.
I checked the file LFQ_Orbitrap_DDA_Condition_A_Sample_Alpha_01.mzML which I still had and found that MSConvert writes "CommonInstrumentParams", i.e. capital "C" compared to ThermoRawFileParser which writes lowercase. I will fix this asap.
If it isn't too many files you can therefore manually change the mzML files for now if you want...
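The casing difference described above ("CommonInstrumentParams" vs. "commonInstrumentParams") could be tolerated without touching the files by matching the group id case-insensitively. This is only a sketch using the standard library XML parser; the function name is hypothetical and it is not necessarily how the fix in spectrum-io is implemented.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Default XML namespace declared by mzML files.
MZML_NS = "{http://psi.hupo.org/ms/mzml}"


def find_param_group(root: ET.Element, group_id: str) -> Optional[ET.Element]:
    """Find a referenceableParamGroup by id, ignoring upper/lower case.

    Hypothetical helper for illustration only.
    """
    wanted = group_id.lower()
    for group in root.iter(f"{MZML_NS}referenceableParamGroup"):
        if group.get("id", "").lower() == wanted:
            return group
    return None
```

Called with "commonInstrumentParams", this would return the group regardless of which converter wrote the file, and None if no such group exists.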
ok, thx for the fast reply. I will wait for the fix. The last thing we want is to create additional confusion by introducing manual changes in the .mzML files. I 😍 .mzML
@tobiasko Reading the instrumentConfiguration from mzML files that were converted with MSConvert is now working with the current development branch of oktoberfest but not with the stable release, since the newest spectrum-io isn't supported by the current stable version of oktoberfest due to a breaking change.
If something else doesn't work with regard to reading the mzML, please consider opening a new issue.
Describe the bug
The above mzML file was generated by MSconvert (Docker container) on Debian Linux with parameters
--mzML --64 --zlib --filter "peakPicking true 1-
To Reproduce
Expected behavior
no error complaining about unsupported mass analysers.
System [please complete the following information]: