MannLabs / alphapeptdeep

Deep learning framework for proteomics
Apache License 2.0

Errors in Transfer #138

Closed hxxhust163 closed 4 months ago

hxxhust163 commented 5 months ago

Hi

I used the command 'peptdeep transfer setting.yaml' to refine a model on my own data. The parameters look like this:

Screenshot from 2024-01-30 15-36-52

I tried two different PSM types, 'msfragger_pepxml' and 'maxquant', but got the same error with both: Screenshot from 2024-01-30 15-40-27

I do not know what is wrong. Could you please help me figure this out? Since the setting yaml file lists 'pfind,diann,speclib_tsv' as options for psm_type, could you also provide some example files showing exactly which columns each format must contain to be compatible with peptdeep?

Thanks in advance!

Best regards Xiaoxiang

jalew188 commented 5 months ago

It looks like you are using Linux or macOS, so the cause may be that mono, which is needed to read Thermo RAW files, is not installed.

See https://github.com/MannLabs/alpharaw/tree/development?tab=readme-ov-file#installation
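A quick way to check these prerequisites from Python, as a minimal sketch. It assumes the mono executable is called `mono` and is on PATH, and it relies on the fact that pythonnet is imported as the module `clr`. The helper name `check_thermo_prerequisites` is hypothetical and not part of alpharaw or peptdeep.

```python
import importlib.util
import shutil


def check_thermo_prerequisites() -> dict:
    """Report whether mono and pythonnet appear to be available.

    Hypothetical helper for illustration; not part of alpharaw or peptdeep.
    """
    return {
        # Assumption: the mono runtime is exposed as `mono` on PATH.
        "mono": shutil.which("mono") is not None,
        # pythonnet is imported as the module `clr`.
        "pythonnet": importlib.util.find_spec("clr") is not None,
    }


if __name__ == "__main__":
    for name, found in check_thermo_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

On Windows the mono entry is expected to be missing and can be ignored; the check only matters on Linux and macOS.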

hxxhust163 commented 5 months ago

Yes, this was on Linux. I also installed peptdeep on Windows and ran the same command as above, but got the same error. I have installed both mono and pythonnet on my Linux system.

I want to use 'speclib_tsv', but I do not know exactly which columns it should contain.

jalew188 commented 5 months ago

Could you share the log file?

hxxhust163 commented 5 months ago

log.txt

jalew188 commented 5 months ago

Your peptdeep version is too old. Please reinstall it with: pip install -U peptdeep alphabase alpharaw
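To confirm which versions are actually installed in the active environment before and after upgrading, something like the following sketch can help. It only uses the standard library; the function name `installed_versions` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("peptdeep", "alphabase", "alpharaw")) -> dict:
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Running this with the same interpreter that runs peptdeep helps rule out the case where pip upgraded a different environment.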

hxxhust163 commented 5 months ago

I followed your advice and ran the command pip install -U peptdeep alphabase alpharaw, but got the same error as above. Here is the log file: log.txt

jalew188 commented 5 months ago

Good, now we have some logging information: 2024-01-31 09:41:28> Loaded 0 PSMs for fragment extraction. This means that the raw names in msms.txt may differ from the names of your input raw files, because PSMs whose raw_name is not among the input files are filtered out:

psm_df = psm_df[
    psm_df.raw_name.isin(ms2_file_dict)
].reset_index(drop=True)

logging.info(f"Loaded {len(psm_df)} PSMs for fragment extraction.")

See https://github.com/MannLabs/alphapeptdeep/blob/main/peptdeep/pipeline_api.py#L195

Could you please check that the raw names in msms.txt match your raw file names, i.e. {raw_name}.raw without the folder path?

Otherwise, the raw file may be corrupted.
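The name check described above can be sketched as follows. This is only an illustration under two assumptions: that the MaxQuant msms.txt is tab-separated with a 'Raw file' column, and that a raw name is the file name without folder or extension. How peptdeep derives raw_name internally may differ, and all function names here are hypothetical.

```python
import csv
from pathlib import Path


def raw_names_in_msms(msms_path: str) -> set:
    """Collect the distinct values of the 'Raw file' column of msms.txt."""
    with open(msms_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["Raw file"] for row in reader}


def raw_names_on_disk(raw_paths) -> set:
    """Reduce each path to its file name without folder or extension."""
    return {Path(p).stem for p in raw_paths}


def compare_raw_names(msms_names: set, disk_names: set) -> dict:
    """Split the names into matched and unmatched groups."""
    return {
        "matched": sorted(msms_names & disk_names),
        "only_in_msms": sorted(msms_names - disk_names),
        "only_on_disk": sorted(disk_names - msms_names),
    }


if __name__ == "__main__":
    # Placeholder paths; replace them with your own files.
    msms = raw_names_in_msms("msms.txt")
    disk = raw_names_on_disk(Path(".").glob("*.raw"))
    print(compare_raw_names(msms, disk))
```

An empty "matched" list would be consistent with the "Loaded 0 PSMs" log line, since every PSM would then be filtered out.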

jalew188 commented 4 months ago

@hxxhust163 Hi, I found the bug (it is in AlphaRaw). I will fix it soon.

jalew188 commented 4 months ago

Will be fixed by https://github.com/MannLabs/alpharaw/pull/30/commits/f4ceadb2eb733e132506ed5c8870b6d66268daf1

jalew188 commented 4 months ago

If you are using peptdeep through Python, you can install the latest alpharaw with: pip install -U alpharaw

J-Burgess commented 4 months ago

Hi Zeng, hope you are well!

I had the same error and followed your advice to upgrade with pip install -U alpharaw, which gave me alpharaw version 0.4.3. That gets me a little further, but now I get a new error: Raw file type 'thermo_raw' is not registered in 'ms_reader_provider'

Please find attached my log: transfer_error.txt

Thanks!

Edit: I only had mono installed and was missing pythonnet. After pip install pythonnet it gets past fragment extraction, but then fails with this error:

2024-02-22 11:36:03> Extracted 510499 PSMs.
2024-02-22 11:36:03> Traceback (most recent call last):
  File "/ibm/hpcfs1/tmp/james.burgess/projects/JB240201_AlphaPeptDeep/alphapeptdeep/peptdeep/pipeline_api.py", line 319, in transfer_learn
    psm_df, frag_df = match_psms()
ValueError: too many values to unpack (expected 2)
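For readers unfamiliar with this ValueError: it is raised when a call returns more values than the caller unpacks, which is typical when two packages are out of sync. The sketch below reproduces the message with stand-in functions. What match_psms really returns is not shown in this thread, so the returned names are placeholders and not the peptdeep or alpharaw API.

```python
def match_two_values():
    """Stand-in for an older helper that returns two values (hypothetical)."""
    return "psm_df", "frag_df"


def match_three_values():
    """Stand-in for a newer helper that returns three values (hypothetical)."""
    return "psm_df", "frag_df_a", "frag_df_b"


def try_unpack_two(func) -> str:
    """Unpack the result into two names, as the caller in the traceback does."""
    try:
        psm_df, frag_df = func()
    except ValueError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print(try_unpack_two(match_two_values))
    print(try_unpack_two(match_three_values))
```

The second call fails in the same way as the traceback, because the caller still expects two values.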

jalew188 commented 4 months ago

@J-Burgess Sorry, I forgot to update peptdeep itself... It should work now.

J-Burgess commented 4 months ago

Hi @jalew188, thanks! I managed to produce new refined models, all good!

jalew188 commented 4 months ago

Great, I will close this issue.