wilhelm-lab / oktoberfest

Rescoring and spectral library generation pipeline for proteomics.
MIT License

The model is not known #216

Open tobiasko opened 2 months ago

tobiasko commented 2 months ago

Describe the bug

AlphaPept model is not available.

To Reproduce

(oktoberfest-env) tobiasko@fgcz-c-072:/scratch/cpanse/PXD028735/oktoberfest$ python -m oktoberfest --config_path specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:06,061 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.6.2
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-08 16:16:06,061 - INFO - oktoberfest.runner::run_job {
    "type": "SpectralLibraryGeneration",
    "tag": "",
    "models": {
        "intensity": "AlphaPept_ms2_generic",
        "irt": "AlphaPept_rt_generic"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "ssl": true,
    "output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/AlphaPept_ms2_generic/",
    "inputs": {
        "library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
        "library_input_type": "fasta"
    },
    "spectralLibraryOptions": {
        "fragmentation": "HCD",
        "collisionEnergy": 35,
        "precursorCharge": [
            2,
            3
        ],
        "minIntensity": 0.0005,
        "batchsize": 10000,
        "format": "msp"
    },
    "fastaDigestOptions": {
        "digestion": "full",
        "missedCleavages": 0,
        "minLength": 7,
        "maxLength": 30,
        "enzyme": "trypsin",
        "specialAas": "KR",
        "db": "target"
    }
}
2024-05-08 16:16:06,061 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-08 16:16:07,023 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences before filtering is 123274
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main                                           │
│                                                                                                  │
│   194 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   195 │   if alter_argv:                                                                         │
│   196 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 197 │   return _run_code(code, main_globals, None,                                             │
│   198 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   199                                                                                            │
│   200 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ /usr/lib/python3.9/runpy.py:87 in _run_code                                                      │
│                                                                                                  │
│    84 │   │   │   │   │      __loader__ = loader,                                                │
│    85 │   │   │   │   │      __package__ = pkg_name,                                             │
│    86 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  87 │   exec(code, run_globals)                                                                │
│    88 │   return run_globals                                                                     │
│    89                                                                                            │
│    90 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in         │
│ <module>                                                                                         │
│                                                                                                  │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│   36 │   traceback.install()                                                                     │
│ ❱ 37 │   main()  # pragma: no cover                                                              │
│   38                                                                                             │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in main    │
│                                                                                                  │
│   29 def main():                                                                                 │
│   30 │   """Execution of oktoberfest from terminal."""                                           │
│   31 │   args = _parse_args()                                                                    │
│ ❱ 32 │   runner.run_job(args.config_path)                                                        │
│   33                                                                                             │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:645 in run_job  │
│                                                                                                  │
│   642 │                                                                                          │
│   643 │   try:                                                                                   │
│   644 │   │   if job_type == "SpectralLibraryGeneration":                                        │
│ ❱ 645 │   │   │   generate_spectral_lib(config_path)                                             │
│   646 │   │   elif job_type == "CollisionEnergyCalibration":                                     │
│   647 │   │   │   run_ce_calibration(config_path)                                                │
│   648 │   │   elif job_type == "Rescoring":                                                      │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:320 in          │
│ generate_spectral_lib                                                                            │
│                                                                                                  │
│   317 │   config = Config()                                                                      │
│   318 │   config.read(config_path)                                                               │
│   319 │                                                                                          │
│ ❱ 320 │   spec_library = _speclib_from_digestion(config)                                         │
│   321 │                                                                                          │
│   322 │   server_kwargs = {                                                                      │
│   323 │   │   "server_url": config.prediction_server,                                            │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/runner.py:239 in          │
│ _speclib_from_digestion                                                                          │
│                                                                                                  │
│   236 │   data_dir = config.output / "data"                                                      │
│   237 │   if not pp_and_filter_step.is_done():                                                   │
│   238 │   │   data_dir.mkdir(exist_ok=True)                                                      │
│ ❱ 239 │   │   spec_library = pp.process_and_filter_spectra_data(                                 │
│   240 │   │   │   library=spec_library, model=config.models["intensity"], tmt_label=config.tag   │
│   241 │   │   )                                                                                  │
│   242 │   │   spec_library.write_as_hdf5(data_dir / f"{library_file.stem}_filtered.hdf5").join   │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:221 in process_and_filter_spectra_data                                                     │
│                                                                                                  │
│   218 │                                                                                          │
│   219 │   # filter                                                                               │
│   220 │   logger.info(f"No of sequences before filtering is {len(library.spectra_data)}")        │
│ ❱ 221 │   library.spectra_data = filter_peptides_for_model(library.spectra_data, model)          │
│   222 │   logger.info(f"No of sequences after filtering is {len(library.spectra_data)}")         │
│   223 │                                                                                          │
│   224 │   library.spectra_data["MASS"] = library.spectra_data["MODIFIED_SEQUENCE"].apply(lambd   │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-env/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessi │
│ ng.py:157 in filter_peptides_for_model                                                           │
│                                                                                                  │
│   154 │   │   │   "max_charge": 6,                                                               │
│   155 │   │   }                                                                                  │
│   156 │   else:                                                                                  │
│ ❱ 157 │   │   raise ValueError(f"The model {model} is not known.")                               │
│   158 │                                                                                          │
│   159 │   return filter_peptides(peptides, **filter_kwargs)                                      │
│   160                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: The model AlphaPept_ms2_generic is not known.
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_8bb_osad. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,922,334,208

Expected behavior

According to the Koina website, the model is available and running.

System [please complete the following information]:

Additional context

Same for ms2pip_2021_HCD. Prosit_2020_intensity_HCD works.

(oktoberfest-env) tobiasko@fgcz-c-072:/scratch/cpanse/PXD028735/oktoberfest$ python -m oktoberfest --config_path specLib_config_test
2024-05-08 16:24:19,974 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_test
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.6.2
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-08 16:24:19,975 - INFO - oktoberfest.runner::run_job {
    "type": "SpectralLibraryGeneration",
    "tag": "",
    "models": {
        "intensity": "Prosit_2020_intensity_HCD",
        "irt": "Prosit_2019_irt"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "ssl": true,
    "output": "./",
    "inputs": {
        "library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
        "library_input_type": "fasta"
    },
    "spectralLibraryOptions": {
        "fragmentation": "HCD",
        "collisionEnergy": 35,
        "precursorCharge": [
            2,
            3
        ],
        "minIntensity": 0.0005,
        "batchsize": 10000,
        "format": "msp"
    },
    "fastaDigestOptions": {
        "digestion": "full",
        "missedCleavages": 0,
        "minLength": 7,
        "maxLength": 30,
        "enzyme": "trypsin",
        "specialAas": "KR",
        "db": "target"
    }
}
2024-05-08 16:24:19,975 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_test
2024-05-08 16:24:21,571 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences before filtering is 123274
2024-05-08 16:24:21,842 - INFO - oktoberfest.preprocessing.preprocessing::process_and_filter_spectra_data No of sequences after filtering is 122826
2024-05-08 16:24:23,517 - INFO - spectrum_io.file.hdf5::write_dataset Data written to data/prosit_input_filtered.hdf5
2024-05-08 16:24:23,525 - INFO - spectrum_io.file.hdf5::write_dataset Data appended to data/prosit_input_filtered.hdf5
2024-05-08 16:24:23,527 - INFO - spectrum_io.file.hdf5::write_dataset Data appended to data/prosit_input_filtered.hdf5
Getting predictions: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [01:13<00:00,  5.67s/it, failed=0, successful=13]
Writing library: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [01:13<00:00,  5.67s/it, missing=0, successful=13]
2024-05-08 16:25:37,354 - INFO - oktoberfest.runner::generate_spectral_lib Finished writing the library to disk
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_caobqeht. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,922,338,304

picciama commented 2 months ago

Hi Tobi, yes, we are currently in the last stage of integrating this. We have a branch where we switched to a new underlying data structure that supports different models. It should technically be functional; maybe you want to try it out: https://github.com/wilhelm-lab/oktoberfest/tree/feature/integrate_AnnData

I will merge this asap once I am back from holiday. I still need to add documentation, and for AlphaPept there is still an issue when the instrument type is not supported; in that case it should fall back to the next-best instrument type.

tobiasko commented 2 months ago

I tried that branch and get a KeyError: 'X'

picciama commented 2 months ago

This is caused by an unidentified amino acid 'X' in the peptide sequence after digestion. There was still a bug in filtering those sequences out after digestion, which I have now fixed. Please try again. In my test case, an in-silico digest with subsequent filtering, prediction and spectral library generation works now.
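As a rough illustration of the kind of filtering described here, the following is a minimal sketch and not Oktoberfest's actual code. The helper name `drop_peptides_with_unknown_residues` and the example peptides are hypothetical, and it assumes unmodified sequences given as plain strings.

```python
# Hypothetical helper for illustration only; not taken from Oktoberfest.
# Assumption: peptides are plain, unmodified one-letter sequences.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def drop_peptides_with_unknown_residues(peptides):
    """Keep only peptides made up entirely of the 20 standard amino acids."""
    return [p for p in peptides if set(p) <= STANDARD_AMINO_ACIDS]


digested = ["PEPTIDEK", "AAXLLR", "MKWVTFISLLR"]
filtered = drop_peptides_with_unknown_residues(digested)
print(filtered)  # ['PEPTIDEK', 'MKWVTFISLLR']
```

Note that a whitelist like this also drops other non-standard letters (for example U, O, B, Z), which may or may not be desired.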

However, we still experience issues with the AlphaPept model, due to a weird bug when receiving the predictions from Koina that is difficult to debug. I will update you once it works.

picciama commented 1 month ago

I have now added the functionality to provide the instrument type via the config file. Simply add instrument_type = "QE", or any other supported instrument type, within the config's inputs section. This will add an additional column "instrument_type" to the metadata information file that is created after digestion. This is how it should look:

"inputs": {
    "instrument_type": "QE"
},
...
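For anyone who prefers to patch an existing config programmatically, here is a minimal sketch under stated assumptions. The helper `with_instrument_type` is hypothetical, and the list of valid values is copied from the AssertionError message reported later in this thread; only the key name and its place in the inputs section come from the comment above.

```python
import json

# Values copied from the AssertionError message reported later in this thread.
VALID_INSTRUMENT_TYPES = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"]


def with_instrument_type(config, instrument_type):
    """Return a copy of the config with inputs.instrument_type set (hypothetical helper)."""
    if instrument_type not in VALID_INSTRUMENT_TYPES:
        raise ValueError(f"Unsupported instrument type: {instrument_type!r}")
    updated = json.loads(json.dumps(config))  # deep copy of plain JSON data
    updated.setdefault("inputs", {})["instrument_type"] = instrument_type
    return updated


config = {
    "type": "SpectralLibraryGeneration",
    "inputs": {"library_input_type": "fasta"},
}
new_config = with_instrument_type(config, "QE")
print(json.dumps(new_config, indent=4))
```

The original dictionary is left untouched, so the patched config can be written to a new file with `json.dump` without overwriting the working one.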

If you intend to provide peptides instead of performing an in-silico digest, you now need to add two additional columns to the peptide input: "instrument_types" and "peptide_length". The following is an excerpt from the documentation for this, which will be online once the branch is merged:

image
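A minimal sketch of adding the two columns to a peptide table follows. Only the column names "instrument_types" and "peptide_length" come from the comment above; the other column names, the bracketed modification notation, the helper `add_alphapept_columns`, and the example rows are assumptions made for illustration.

```python
import csv
import io
import re


def add_alphapept_columns(rows, instrument_type):
    """Append instrument_types and peptide_length to each row (hypothetical helper)."""
    extended = []
    for row in rows:
        # Assumption: modifications are written in square brackets, e.g. C[UNIMOD:4],
        # and terminal markers use '-'; both are stripped before counting residues.
        bare = re.sub(r"\[[^\]]*\]", "", row["modified_sequence"]).replace("-", "")
        extended.append(
            {**row, "instrument_types": instrument_type, "peptide_length": len(bare)}
        )
    return extended


# Assumed input layout; the real peptide input may use different columns.
peptides_csv = """modified_sequence,collision_energy,precursor_charge,fragmentation
AAAC[UNIMOD:4]K,35,2,HCD
PEPTIDEK,35,3,HCD
"""

rows = list(csv.DictReader(io.StringIO(peptides_csv)))
extended = add_alphapept_columns(rows, "QE")

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(extended[0].keys()))
writer.writeheader()
writer.writerows(extended)
print(out.getvalue())
```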

tobiasko commented 1 month ago

ok! Will try.

picciama commented 1 month ago

I just added another bugfix for ms2pip, which was due to an inconsistent shape as it only returns +1 ions in a different order compared to how we store it. So that should work as well now in case you stumbled over an error there.

tobiasko commented 1 month ago

What are your plans for the next release? I am asking myself whether I should test now on the specific branch or wait for the next release. What is the status of the branch? Stable/pre-release or work-in-progress?

picciama commented 1 month ago

I won't be able to release a new stable version before the end of June since I am attending ASMS, but I have merged lots of stuff in spectrum_fundamentals and spectrum_io, released all of that, and already merged everything here onto a release branch. That is, if you switch to release/0.7.0, you should find a working version, albeit not fully documented and with some other open issues I need to deal with before releasing.

tobiasko commented 1 month ago

Why exactly is the instrument_type key part of the inputs section and not of spectralLibraryOptions?

tobiasko commented 1 month ago

2024-05-29 15:10:21,819 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main                                           │
│                                                                                                  │
│   194 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   195 │   if alter_argv:                                                                         │
│   196 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 197 │   return _run_code(code, main_globals, None,                                             │
│   198 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   199                                                                                            │
│   200 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ /usr/lib/python3.9/runpy.py:87 in _run_code                                                      │
│                                                                                                  │
│    84 │   │   │   │   │      __loader__ = loader,                                                │
│    85 │   │   │   │   │      __package__ = pkg_name,                                             │
│    86 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  87 │   exec(code, run_globals)                                                                │
│    88 │   return run_globals                                                                     │
│    89                                                                                            │
│    90 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in  │
│ <module>                                                                                         │
│                                                                                                  │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│   36 │   traceback.install()                                                                     │
│ ❱ 37 │   main()  # pragma: no cover                                                              │
│   38                                                                                             │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in  │
│ main                                                                                             │
│                                                                                                  │
│   29 def main():                                                                                 │
│   30 │   """Execution of oktoberfest from terminal."""                                           │
│   31 │   args = _parse_args()                                                                    │
│ ❱ 32 │   runner.run_job(args.config_path)                                                        │
│   33                                                                                             │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/runner.py:635 in   │
│ run_job                                                                                          │
│                                                                                                  │
│   632 │   """                                                                                    │
│   633 │   conf = Config()                                                                        │
│   634 │   conf.read(config_path)                                                                 │
│ ❱ 635 │   conf.check()                                                                           │
│   636 │                                                                                          │
│   637 │   output_folder = conf.output                                                            │
│   638 │   job_type = conf.job_type                                                               │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:32 │
│ 6 in check                                                                                       │
│                                                                                                  │
│   323 │   │   │   │   │   " Please check and use a TMT model instead."                           │
│   324 │   │   │   │   )                                                                          │
│   325 │   │   if self.job_type == "SpectralLibraryGeneration":                                   │
│ ❱ 326 │   │   │   self._check_for_speclib()                                                      │
│   327 │   │                                                                                      │
│   328 │   │   if "alphapept" in int_model:                                                       │
│   329 │   │   │   instrument_type = self.instrument_type                                         │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:36 │
│ 4 in _check_for_speclib                                                                          │
│                                                                                                  │
│   361 │   │   │   instrument_type = self.instrument_type                                         │
│   362 │   │   │   valid_alphapept_instrument_types = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"]      │
│   363 │   │   │   if instrument_type is None:                                                    │
│ ❱ 364 │   │   │   │   raise AssertionError(                                                      │
│   365 │   │   │   │   │   f"The chosen intensity model {self.models['intensity']} requires an    │
│   366 │   │   │   │   │   f"Provide one of {valid_alphapept_instrument_types}."                  │
│   367 │   │   │   │   )                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: The chosen intensity model AlphaPept_ms2_generic requires an instrument type. Provide one of ['QE', 'LUMOS', 'TIMSTOF', 'SCIEXTOF'].
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_b0xpfs7u. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,895,894,528

cat specLib_config_UP000000625_AlphaPept_ms2_generic
{
    "type": "SpectralLibraryGeneration",
    "tag": "",
    "models": {
        "intensity": "AlphaPept_ms2_generic",
        "irt": "AlphaPept_rt_generic"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "ssl": true,
    "output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/AlphaPept_ms2_generic/",
    "inputs": {
        "library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
        "library_input_type": "fasta",
    "instrument_types": "QE"
    },
    "spectralLibraryOptions": {
        "fragmentation": "HCD",
        "collisionEnergy": 35,
        "precursorCharge": [2,3],
        "minIntensity": 5e-4,
        "batchsize": 10000,
        "format": "msp"
    },
    "fastaDigestOptions": {
        "digestion": "full",
        "missedCleavages": 0,
        "minLength": 7,
        "maxLength": 30,
        "enzyme": "trypsin",
        "specialAas": "KR",
        "db": "target"
    }
}
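
One thing that may be worth checking (an observation, not something confirmed in this thread): the maintainer's comment names the key instrument_type, while the config above uses instrument_types. The sketch below looks for such near-miss key names with the standard library's `difflib`; the helper `find_near_miss_keys` and the 0.8 similarity cutoff are illustrative choices, not part of Oktoberfest.

```python
import difflib

EXPECTED_KEY = "instrument_type"  # key name as given in the maintainer's comment


def find_near_miss_keys(config, expected=EXPECTED_KEY):
    """Return (section, key) pairs whose key resembles `expected` without matching it."""
    hits = []
    for section, value in config.items():
        if not isinstance(value, dict):
            continue
        for key in value:
            close = difflib.get_close_matches(expected, [key], n=1, cutoff=0.8)
            if key != expected and close:
                hits.append((section, key))
    return hits


# Reduced copy of the config shown above.
config = {
    "models": {"intensity": "AlphaPept_ms2_generic", "irt": "AlphaPept_rt_generic"},
    "inputs": {"library_input_type": "fasta", "instrument_types": "QE"},
}

has_expected_key = EXPECTED_KEY in config.get("inputs", {})
near_misses = find_near_miss_keys(config)
print("inputs.instrument_type present:", has_expected_key)
print("similar keys found:", near_misses)
```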

tobiasko commented 1 month ago

Looks like your code expects instrument_types to be part of the models section:

python -m oktoberfest --config_path specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,694 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job Oktoberfest version 0.7.0
Copyright 2024, Wilhelmlab at Technical University of Munich
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job Job executed with the following config:
2024-05-29 16:10:39,695 - INFO - oktoberfest.runner::run_job {
    "type": "SpectralLibraryGeneration",
    "tag": "",
    "models": {
        "intensity": "ms2pip_2021_HCD",
        "irt": "Deeplc_hela_hf",
        "instrument_types": "QE"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "ssl": true,
    "output": "/scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/ms2pip_2021_HCD/",
    "inputs": {
        "library_input": "/scratch/cpanse/PXD028735/fasta/uniprotkb_proteome_UP000000625_2023_07_04.fasta",
        "library_input_type": "fasta"
    },
    "spectralLibraryOptions": {
        "fragmentation": "HCD",
        "collisionEnergy": 35,
        "precursorCharge": [
            2,
            3
        ],
        "minIntensity": 0.0005,
        "batchsize": 10000,
        "format": "msp"
    },
    "fastaDigestOptions": {
        "digestion": "full",
        "missedCleavages": 0,
        "minLength": 7,
        "maxLength": 30,
        "enzyme": "trypsin",
        "specialAas": "KR",
        "db": "target"
    }
}
2024-05-29 16:10:39,695 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_ms2pip_2021_HCD_Deeplc_hela_hf
2024-05-29 16:10:39,695 - INFO - oktoberfest.utils.process_step::is_done Skipping speclib_digested step because /scratch/cpanse/PXD028735/oktoberfest/SpectralLibraryGeneration/UP000000625/ms2pip_2021_HCD/proc/speclib_digested.done was found.
/home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/anndata/_core/aligned_df.py:67: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
/home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/preprocessing/preprocessing.py:247: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
  library.obs["MASS"] = library.obs["MODIFIED_SEQUENCE"].apply(lambda x: compute_peptide_mass(x))
Getting predictions: 100%|████████████████████████████████████████████████████████████████████████| 13/13 [00:46<00:00,  3.56s/it, failed=0, successful=13]
Writing library: 100%|███████████████████████████████████████████████████████████████████████████| 13/13 [00:46<00:00,  3.56s/it, missing=0, successful=13]
2024-05-29 16:11:28,309 - INFO - oktoberfest.runner::generate_spectral_lib Finished writing the library to disk
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_4gaw6fr8. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,895,886,336

tobiasko commented 1 month ago

...or maybe not!?

python -m oktoberfest --config_path specLib_config_UP000000625_AlphaPept_ms2_generic
2024-05-29 16:16:12,623 - INFO - oktoberfest.utils.config::read Reading configuration from specLib_config_UP000000625_AlphaPept_ms2_generic
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /usr/lib/python3.9/runpy.py:197 in _run_module_as_main                                           │
│                                                                                                  │
│   194 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   195 │   if alter_argv:                                                                         │
│   196 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 197 │   return _run_code(code, main_globals, None,                                             │
│   198 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   199                                                                                            │
│   200 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ /usr/lib/python3.9/runpy.py:87 in _run_code                                                      │
│                                                                                                  │
│    84 │   │   │   │   │      __loader__ = loader,                                                │
│    85 │   │   │   │   │      __package__ = pkg_name,                                             │
│    86 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  87 │   exec(code, run_globals)                                                                │
│    88 │   return run_globals                                                                     │
│    89                                                                                            │
│    90 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:37 in  │
│ <module>                                                                                         │
│                                                                                                  │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│   36 │   traceback.install()                                                                     │
│ ❱ 37 │   main()  # pragma: no cover                                                              │
│   38                                                                                             │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/__main__.py:32 in  │
│ main                                                                                             │
│                                                                                                  │
│   29 def main():                                                                                 │
│   30 │   """Execution of oktoberfest from terminal."""                                           │
│   31 │   args = _parse_args()                                                                    │
│ ❱ 32 │   runner.run_job(args.config_path)                                                        │
│   33                                                                                             │
│   34                                                                                             │
│   35 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/runner.py:635 in   │
│ run_job                                                                                          │
│                                                                                                  │
│   632 │   """                                                                                    │
│   633 │   conf = Config()                                                                        │
│   634 │   conf.read(config_path)                                                                 │
│ ❱ 635 │   conf.check()                                                                           │
│   636 │                                                                                          │
│   637 │   output_folder = conf.output                                                            │
│   638 │   job_type = conf.job_type                                                               │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:32 │
│ 6 in check                                                                                       │
│                                                                                                  │
│   323 │   │   │   │   │   " Please check and use a TMT model instead."                           │
│   324 │   │   │   │   )                                                                          │
│   325 │   │   if self.job_type == "SpectralLibraryGeneration":                                   │
│ ❱ 326 │   │   │   self._check_for_speclib()                                                      │
│   327 │   │                                                                                      │
│   328 │   │   if "alphapept" in int_model:                                                       │
│   329 │   │   │   instrument_type = self.instrument_type                                         │
│                                                                                                  │
│ /home/tobiasko/oktoberfest-release070/lib/python3.9/site-packages/oktoberfest/utils/config.py:36 │
│ 4 in _check_for_speclib                                                                          │
│                                                                                                  │
│   361 │   │   │   instrument_type = self.instrument_type                                         │
│   362 │   │   │   valid_alphapept_instrument_types = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"]      │
│   363 │   │   │   if instrument_type is None:                                                    │
│ ❱ 364 │   │   │   │   raise AssertionError(                                                      │
│   365 │   │   │   │   │   f"The chosen intensity model {self.models['intensity']} requires an    │
│   366 │   │   │   │   │   f"Provide one of {valid_alphapept_instrument_types}."                  │
│   367 │   │   │   │   )                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: The chosen intensity model AlphaPept_ms2_generic requires an instrument type. Provide one of ['QE', 'LUMOS', 'TIMSTOF', 'SCIEXTOF'].
WARNING:root:WARNING: Temp mmap arrays were written to /tmp/temp_mmap_o3_r3jn1. Cleanup of this folder is OS dependant, and might need to be triggered manually! Current space: 38,896,148,480
picciama commented 1 month ago

It seems you ran into an unfortunate combination of problems:

  1. key name: It is instrument_type, not instrument_types (note the extra 's'). The key is therefore not found, and since you are doing library generation, the instrument type cannot be read from spectra files either. As a result, the instrument type input for AlphaPept is simply not defined.

  2. key location: It is correct that the key needs to be in the inputs section. In your first attempt, you moved the key to the models section, which worked only because you were running with ms2pip, which doesn't require the instrument type as an input. The moment you switched to AlphaPept, oktoberfest could again not find the key, so it failed.
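For illustration, here is a minimal sketch of a pre-flight check that would flag both problems before a run. It is not part of oktoberfest: the helper name `find_instrument_type_problems` is hypothetical. The only things taken from this thread are the key name `instrument_type`, its location in the `inputs` section, and the valid values listed in the error message.

```python
# Valid values as printed in the AssertionError above.
VALID_ALPHAPEPT_INSTRUMENT_TYPES = ["QE", "LUMOS", "TIMSTOF", "SCIEXTOF"]


def find_instrument_type_problems(config: dict) -> list:
    """Return human-readable problems with the instrument type setting.

    Hypothetical helper for illustration only; not an oktoberfest API.
    """
    problems = []
    for section_name, section in config.items():
        if not isinstance(section, dict):
            continue
        # Problem 1: plural key name.
        if "instrument_types" in section:
            problems.append(
                f"'{section_name}' contains 'instrument_types'; the key is 'instrument_type'"
            )
        # Problem 2: correct key name, wrong section.
        if "instrument_type" in section and section_name != "inputs":
            problems.append(
                f"'instrument_type' found in '{section_name}'; it belongs in 'inputs'"
            )

    # Only AlphaPept intensity models need the instrument type here.
    intensity_model = config.get("models", {}).get("intensity", "")
    if "alphapept" in intensity_model.lower():
        value = config.get("inputs", {}).get("instrument_type")
        if value is None:
            problems.append("'inputs.instrument_type' is missing but required for AlphaPept models")
        elif value not in VALID_ALPHAPEPT_INSTRUMENT_TYPES:
            problems.append(f"'{value}' is not one of {VALID_ALPHAPEPT_INSTRUMENT_TYPES}")
    return problems


# Config fragment with the plural key, as in the failing run (values are examples).
config = {
    "models": {"intensity": "AlphaPept_ms2_generic", "irt": "AlphaPept_rt_generic"},
    "inputs": {"library_input_type": "fasta", "instrument_types": "QE"},
}
for problem in find_instrument_type_problems(config):
    print(problem)
```

Renaming the key to `instrument_type` and keeping it under `inputs` makes the helper return an empty list.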

Please check this comment again: https://github.com/wilhelm-lab/oktoberfest/issues/216#issuecomment-2119305520

To be fair, in the peptides input annotation in the screenshot, there is a plural version of the key name, but in the example, the column name is correct. My bad, this is confusing of course.

Why exactly is the instrument_type key part of the inputs section and not the spectralLibraryOptions?

Because we don't have a spectral library generation section when we do rescoring. In such cases, the key overrides what oktoberfest reads from the spectra files. This is required when the spectra were acquired on an unsupported instrument type, because mapping from the spectra file is not trivial... The same goes for the fragmentation method, which is currently mapped from the spectra file, but that mapping is not 100% bulletproof either. Is that confusing? Maybe I should allow this key in both locations? I am trying to keep the config as simple as possible...
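The precedence described here can be sketched as follows. This is an assumption-laden illustration, not oktoberfest's actual code: the function name `resolve_instrument_type` and its arguments are hypothetical.

```python
from typing import Optional


def resolve_instrument_type(
    config_value: Optional[str], from_spectra: Optional[str]
) -> Optional[str]:
    """Pick the instrument type to use (hypothetical helper, for illustration).

    A value set in the config takes precedence over whatever was read
    from the spectra files.
    """
    if config_value is not None:
        return config_value
    return from_spectra


# Rescoring, no override: the value read from the spectra files is used.
print(resolve_instrument_type(None, "QE"))
# Rescoring with an override in the config: the config value wins.
print(resolve_instrument_type("LUMOS", "QE"))
# Library generation: there are no spectra files, so without the config key nothing is defined.
print(resolve_instrument_type(None, None))
```

The last case corresponds to the situation in this issue: with no spectra files to fall back on, the config key is the only source for the instrument type.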

tobiasko commented 1 month ago

And your code also asks for "Provide one of {valid_alphapept_instrument_types}.". Pretty sure I started with the singular and changed to the plural after getting an error. Will try again.