vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Thermo RAW file format not supported #298

Closed Munchic closed 2 years ago

Munchic commented 2 years ago

Hello, I get this error when trying to run the Linux command-line version of DiaNN. I'm running Ubuntu 22.04 inside a Docker container with 64 cores and 64 GB RAM. Note that the same raw file and configuration run fine in the Windows GUI of DiaNN. Thank you for your help!

DIA-NN 1.8 (Data-Independent Acquisition by Neural Networks)
Compiled on Jun 28 2021 10:59:57
Current date and time: Wed Feb  2 01:42:11 2022
Logical CPU cores: 64
Thread number set to 64
Output will be filtered at 0.01 FDR
Deep learning will be used to generate a new in silico spectral library from peptides provided
Library-free search enabled
Min fragment m/z set to 200
Max fragment m/z set to 1800
N-terminal methionine excision enabled
In silico digest will involve cuts at K*,R*
Maximum number of missed cleavages set to 1
Min peptide length set to 7
Max peptide length set to 50
Min precursor m/z set to 300
Max precursor m/z set to 1800
Min precursor charge set to 1
Max precursor charge set to 4
Cysteine carbamidomethylation enabled as a fixed modification
A spectral library will be created from the DIA runs and used to reanalyse them; .quant files will only be saved to disk during the first step
When generating a spectral library, in silico predicted spectra will be retained if deemed more reliable than experimental ones
DIA-NN will optimise the mass accuracy automatically using the first run in the experiment. This is useful primarily for quick initial analyses, when it is not yet known which mass accuracy setting works best for a particular acquisition scheme.
WARNING: MBR turned off, two or more raw files are required

1 files will be processed
[0:00] Loading FASTA _ip2_ip2_data_paser_database__UniProt_human_20141017_contaminant_10-17-2014_reversed.fasta
[0:21] Processing FASTA
[0:45] Assembling elution groups
[1:11] 10250451 precursors generated
[1:11] Protein names missing for some isoforms
[1:11] Gene names missing for some isoforms
[1:11] Library contains 37577 proteins, and 20167 genes
[1:13] Encoding peptides for spectra and RTs prediction
[1:34] Predicting spectra and IMs
[16:04] Predicting RTs
[17:27] Decoding predicted spectra and IMs
[17:43] Decoding RTs
[17:51] Saving the library to lib.predicted.speclib
[18:01] Initialising library

[18:11] File #1/1
[18:11] Loading run data/C20150515_MCF7_T1_cmpds_P-0015_A01_DIA_acq_01.raw
Thermo RAW file format not supported.

vdemichev commented 2 years ago

Hi, for reading .raw files directly, without conversion to .mzML, please use the Windows version of DIA-NN under Windows or Wine.

Best, Vadim
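
For an automated job, it may help to separate the inputs that need conversion before the Linux build is launched. A minimal sketch, assuming (based only on the error reported in this thread) that the `.raw` extension is the one the Linux build rejects; the helper name `split_inputs` is hypothetical:

```python
from pathlib import Path

def split_inputs(paths):
    """Partition run files into those passed to DIA-NN as-is and those
    that first need conversion (e.g. to .mzML).

    Assumption: only Thermo .raw files are treated as unreadable by the
    Linux build, as reported in this thread.
    """
    direct, to_convert = [], []
    for p in map(Path, paths):
        if p.suffix.lower() == ".raw":
            to_convert.append(p)
        else:
            direct.append(p)
    return direct, to_convert
```

The check is case-insensitive, so `run.RAW` and `run.raw` are handled the same way.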

Munchic commented 2 years ago

Hi Vadim,

Thank you, I will try to run them through Wine.

Sincerely, Khoi

Munchic commented 2 years ago

@vdemichev Is there a way to run DiaNN in Wine directly from the command line, without running the setup first (which would require GUI input)? I am trying to set up an automated cloud job for processing raw files with DiaNN, running it in an Ubuntu-based Docker container with Wine. Because it's an automated job, I can't have a graphical installation step there. Let me know if my question makes sense, and thank you for your help!

vdemichev commented 2 years ago

Hi Khoi, all the setup does is unpack the files into the chosen folder. So it is not really necessary to run it; you can just copy the files over.

Best, Vadim

Munchic commented 2 years ago

Understood, thanks! A dumb question: how do I obtain those files including the executable? When I run setup on a Windows machine, it seems to create a folder which contains DiaNN.exe. However, I don't see how to download it directly from the GitHub page. Thank you.

vdemichev commented 2 years ago

You need to run the setup once, and then you can copy the files to any machine. No, you cannot download them directly from GitHub.

Vadim
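
Once the unpacked folder has been copied into the container, a headless launch through Wine could be assembled roughly as follows. This is an untested sketch: the folder `DiaNN/`, the file names, and the helper `wine_diann_command` are hypothetical, and the options shown are meant to mirror the settings visible in the log above rather than a recommended configuration.

```python
import shlex

def wine_diann_command(exe_path, run_files, fasta, report, threads=8):
    """Build (but do not run) a DIA-NN command line launched through Wine."""
    cmd = ["wine", exe_path]
    for run in run_files:
        cmd += ["--f", run]      # one --f option per acquisition file
    cmd += [
        "--fasta", fasta,
        "--fasta-search",        # library-free search, as in the log above
        "--predictor",           # deep-learning prediction of spectra and RTs
        "--qvalue", "0.01",      # 1% FDR filter, as in the log above
        "--threads", str(threads),
        "--out", report,
    ]
    return cmd

# Print the command for inspection; all paths here are placeholders.
print(shlex.join(wine_diann_command(
    "DiaNN/DiaNN.exe", ["data/run1.raw"], "db.fasta", "report.tsv")))
```

Returning the argument list instead of a single string lets it be passed straight to `subprocess.run` without shell quoting issues.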

Munchic commented 2 years ago

An update: I used ThermoRawFileParser.exe with Mono on Linux to convert the .raw files to .mzML, and feeding those into DIA-NN worked. Thanks!
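
The conversion step described above might be scripted along these lines. It is a sketch, not the exact setup used here: it assumes `mono` is on the `PATH`, that ThermoRawFileParser accepts `-i=`, `-o=` and `-f=1` (mzML output), and that each output file keeps the input's base name with an `.mzML` extension. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path

def convert_command(parser_exe, raw_file, out_dir):
    """Command line converting one Thermo .raw file to mzML via Mono."""
    return ["mono", str(parser_exe), f"-i={raw_file}", f"-o={out_dir}", "-f=1"]

def expected_mzml(raw_file, out_dir):
    """Assumed output path: same base name, .mzML extension, in out_dir."""
    return Path(out_dir) / (Path(raw_file).stem + ".mzML")

def convert_all(parser_exe, raw_files, out_dir, run=subprocess.run):
    """Convert each raw file and return the list of expected mzML paths.

    `run` is injectable so the function can be exercised without Mono.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    outputs = []
    for raw in raw_files:
        run(convert_command(parser_exe, raw, out_dir), check=True)
        outputs.append(expected_mzml(raw, out_dir))
    return outputs
```

The returned paths can then be handed to the Linux build of DIA-NN, one `--f` option per file.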

aupadh12 commented 1 year ago

Hi @Munchic, I hope you are doing well. I wanted to reach out about running DIA-NN on Kubernetes as a container. We are planning to do something similar and move away from running DIA-NN on Windows computers. Could you please share your implementation design and, if possible, the manifest files for Kubernetes? Also, thank you for sharing the solution for handling Thermo files, which are not supported by the DIA-NN Linux build at this time.

Munchic commented 1 year ago

Hi @aupadh12! Thank you for reaching out. We implemented a Cromwell WDL workflow using the Linux command-line version in a Docker container. You can check out the code here: https://github.com/broadinstitute/PANOPLY/tree/dev/third-party-modules/panoply_diann_search

Let me know if that is helpful and good luck with the Kubernetes implementation!