vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Cannot load file #1109

Closed boehmpjn closed 4 months ago

boehmpjn commented 4 months ago

Hello,

I am running DIA-NN 1.8 through FragPipe, and DIA-NN cannot load most of the files I am providing. I have looked through previous reports of this problem, and it mostly occurs when the reader for raw files is not installed. However, I am providing mzML files. I have tried running it twice but got the same error messages. The first file differs from the others in its zlib compression.

Attached logs: log_2024-07-26_03-26-44.txt, log_2024-07-26_14-37-11.txt

Thank you for your help and all the best, Paul
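Since the post above mentions that the files differ in zlib compression, one way to compare them is to tally the encoding terms on each file's binary data arrays. The following is a minimal sketch using only the Python standard library, not a DIA-NN or FragPipe feature. The function name `summarize_binary_encoding` and the in-memory `SAMPLE` fragment are hypothetical; the accessions are the PSI-MS controlled vocabulary terms for compression and float precision.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# PSI-MS controlled vocabulary accessions that describe binary array encoding.
ENCODING_TERMS = {
    "MS:1000574": "zlib compression",
    "MS:1000576": "no compression",
    "MS:1000521": "32-bit float",
    "MS:1000523": "64-bit float",
}


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://psi.hupo.org/ms/mzml}cvParam' -> 'cvParam'."""
    return tag.rsplit("}", 1)[-1]


def summarize_binary_encoding(source):
    """Count encoding-related cvParams over all binaryDataArray elements.

    `source` may be a file path or a binary file-like object.
    """
    counts = Counter()
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "binaryDataArray":
            continue
        for child in elem:
            if local_name(child.tag) == "cvParam":
                label = ENCODING_TERMS.get(child.get("accession"))
                if label is not None:
                    counts[label] += 1
        elem.clear()  # release the (potentially large) encoded binary text
    return counts


# Hypothetical, heavily truncated fragment used only to exercise the function.
SAMPLE = b"""<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <binaryDataArray encodedLength="0">
    <cvParam accession="MS:1000523" name="64-bit float"/>
    <cvParam accession="MS:1000574" name="zlib compression"/>
    <cvParam accession="MS:1000514" name="m/z array"/>
    <binary/>
  </binaryDataArray>
  <binaryDataArray encodedLength="0">
    <cvParam accession="MS:1000521" name="32-bit float"/>
    <cvParam accession="MS:1000576" name="no compression"/>
    <cvParam accession="MS:1000515" name="intensity array"/>
    <binary/>
  </binaryDataArray>
</mzML>"""

print(dict(summarize_binary_encoding(io.BytesIO(SAMPLE))))
```

For a real file, a path can be passed in place of the `io.BytesIO` object. Running it on the file that loads and on one that does not would show whether compression is really the only encoding difference between them.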

vdemichev commented 4 months ago

Hi Paul,

Looks like those mzML files are not in a format supported by DIA-NN. Please check that the recommended settings are used for conversion: https://github.com/vdemichev/DiaNN?tab=readme-ov-file#raw-data-formats

Best, Vadim

boehmpjn commented 4 months ago

Hello Vadim,

Thanks for the quick response. I converted the files as described in the FragPipe tutorial (https://fragpipe.nesvilab.org/docs/tutorial_convert.html). Afterwards I had to filter out the UV spectra, which I did with the transform function of the psims package (https://mobiusklein.github.io/psims/docs/build/html/transform/mzml.html):

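from psims.transform.mzml import MzMLTransformer

def transform_drop_UV(spectrum):
    # Return None for spectra that should be dropped: those whose native ID
    # contains controllerType=3 or controllerType=4.
    spectrum_id = spectrum['id']
    if "controllerType=3" in spectrum_id or "controllerType=4" in spectrum_id:
        return None
    # All other spectra are passed through unchanged.
    return spectrum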

input_file = r"FILEPATH\20240722_LiPMS_IA00106.mzML"
output_file = r"FILEPATH\20240722_LiPMS_IA00106_noUV_v1.mzML"

with open(input_file, 'rb') as in_stream, open(output_file, 'wb') as out_stream:
    MzMLTransformer(in_stream, out_stream, transform_drop_UV).write()

The resulting file is still readable by OpenMS and contains the expected MS1 and MS2 spectra. However, no matter what I do, DIA-NN cannot find the MS2 spectra.

[0:02] File #1/1
[0:02] Loading run X:\Projects\LiP-Collab\20240717_TestOfAscent\FragPipe\raw\20240722_LiPMS_IA00106_noUV_v1.mzML
No MS2 spectra: aborting
ERROR: cannot load the file, skipping
[0:16] 0 library precursors are potentially detectable
[0:16] Processing...
[0:16] Using MS1 mass accuracy: 20 ppm
[0:16] Using mass accuracy: 20 ppm
[0:16] Removing low confidence identifications
[0:16] Removing interfering precursors
[0:16] Too few confident identifications, neural networks will not be used
[0:16] Number of IDs at 0.01 FDR: 0
[0:16] Calculating protein q-values
[0:16] Number of protein isoforms identified at 1% FDR: 0 (precursor-level), 0 (protein-level) (inference performed using proteotypic peptides only)
[0:16] Quantification

Any input on why this happens is greatly appreciated! Do I lose important information when filtering?

Best, Paul
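To check independently of OpenMS what the filtered file contains, the spectra can be tallied by controller type and MS level. This is a minimal sketch using only the Python standard library. It does not reproduce how DIA-NN parses mzML, so it cannot explain the "No MS2 spectra" message by itself. The function name `count_spectra` and the `SAMPLE` fragment are hypothetical; `MS:1000511` is the PSI-MS accession for "ms level", and the `controllerType=` pattern follows the spectrum IDs used in the filter above.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

MS_LEVEL_ACCESSION = "MS:1000511"  # PSI-MS term "ms level"
CONTROLLER_RE = re.compile(r"controllerType=(\d+)")


def local_name(tag):
    """Strip the XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def count_spectra(source):
    """Count spectra by (controllerType, ms level).

    Missing values are reported as 'n/a'. `source` may be a file path
    or a binary file-like object.
    """
    counts = Counter()
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "spectrum":
            continue
        match = CONTROLLER_RE.search(elem.get("id", ""))
        controller = match.group(1) if match else "n/a"
        ms_level = "n/a"
        for child in elem:  # the ms level cvParam is a direct child of <spectrum>
            if (local_name(child.tag) == "cvParam"
                    and child.get("accession") == MS_LEVEL_ACCESSION):
                ms_level = child.get("value", "n/a")
                break
        counts[(controller, ms_level)] += 1
        elem.clear()  # free memory held by the spectrum's binary arrays
    return counts


# Hypothetical, heavily truncated fragment used only to exercise the function.
SAMPLE = b"""<mzML xmlns="http://psi.hupo.org/ms/mzml"><run><spectrumList count="3">
  <spectrum id="controllerType=0 controllerNumber=1 scan=1" index="0">
    <cvParam accession="MS:1000511" name="ms level" value="1"/>
  </spectrum>
  <spectrum id="controllerType=0 controllerNumber=1 scan=2" index="1">
    <cvParam accession="MS:1000511" name="ms level" value="2"/>
  </spectrum>
  <spectrum id="controllerType=4 controllerNumber=1 scan=1" index="2"/>
</spectrumList></run></mzML>"""

for key, n in sorted(count_spectra(io.BytesIO(SAMPLE)).items()):
    print(key, n)
```

Running this on both the unfiltered and the filtered file would show whether the MS2 spectra survive the filtering step with their "ms level" annotation intact, which speaks to the question of lost information.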

vdemichev commented 4 months ago

Hi Paul,

If the file was not converted with these settings (https://raw.githubusercontent.com/vdemichev/DiaNN/master/GUI/MSConvert.png), it might not work in DIA-NN.

Best, Vadim

boehmpjn commented 4 months ago

Hello Vadim,

Could you please send me the full titleMaker filter configuration (it is cut off at the end in the screenshot), or the command-line version of the MSConvert command? That is the only difference I can find.

Best, Paul

vdemichev commented 4 months ago

Hi Paul,

I am not really proficient with the command-line version; the screenshot just shows what it looks like in the GUI. The titleMaker filter is added automatically by the MSConvert GUI.

Best, Vadim

boehmpjn commented 4 months ago

Hello Vadim,

Thank you for your help. I got it running with other files, but I was not able to run it with the filtered files.

Best, Paul