mgleeming / synthedia

Create synthetic DIA LC-MS/MS for proteomics experiments
https://synthedia.org
BSD 3-Clause "New" or "Revised" License

"The specified MaxQuant msms.txt file does not exist" #36

Closed johawahn closed 1 year ago

johawahn commented 1 year ago

Hello! I want to process the MaxQuant test file from the resources page using the synthedia package from the command line. However, the msms.txt file is not detected. The package works fine with the example Prosit file. Could you help me figure out how to solve this? Thank you.

Operating System: Ubuntu 20.04.5 LTS Kernel: Linux 5.15.0-58-generic

(Synthedia) usr@sysbc-lx-507:$ synthedia --mq_txt_dir '~/230125_synthedia_test/example_MaxQuant_msms/example_msms.txt' --centroid_ms1 --centroid_ms2
2023-01-25 14:49:14,578 - INFO - Started Synthedia 2023-01-25 14:49:14.578185
2023-01-25 14:49:14,578 - INFO - Config args:
2023-01-25 14:49:14,578 - INFO - mq_txt_dir: ~/230125_synthedia_test/example_MaxQuant_msms/example_msms.txt
2023-01-25 14:49:14,578 - INFO - prosit: None
2023-01-25 14:49:14,578 - INFO - prosit_peptide_abundance_model: exponentially_modified_gaussian
2023-01-25 14:49:14,578 - INFO - prosit_peptide_abundance_mean: 22
2023-01-25 14:49:14,578 - INFO - prosit_peptide_abundance_stdev: 1.4
2023-01-25 14:49:14,578 - INFO - prosit_peptide_abundance_emg_k: 1
2023-01-25 14:49:14,578 - INFO - acquisition_schema: None
2023-01-25 14:49:14,578 - INFO - use_existing_peptide_file: None
2023-01-25 14:49:14,578 - INFO - out_dir: /local/home/usr/output
2023-01-25 14:49:14,578 - INFO - output_label: output
2023-01-25 14:49:14,578 - INFO - config: None
2023-01-25 14:49:14,578 - INFO - silent: False
2023-01-25 14:49:14,578 - INFO - write_params: False
2023-01-25 14:49:14,578 - INFO - mq_pepthreshold: 0.001
2023-01-25 14:49:14,578 - INFO - filterTerm: ['CON', 'REV']
2023-01-25 14:49:14,578 - INFO - num_processors: 8
2023-01-25 14:49:14,578 - INFO - ms1_min_mz: 350
2023-01-25 14:49:14,578 - INFO - ms1_max_mz: 1600
2023-01-25 14:49:14,578 - INFO - ms2_min_mz: 100
2023-01-25 14:49:14,578 - INFO - ms2_max_mz: 2000
2023-01-25 14:49:14,578 - INFO - ms1_resolution: 120000
2023-01-25 14:49:14,579 - INFO - ms2_resolution: 15000
2023-01-25 14:49:14,579 - INFO - ms1_scan_duration: 0.37
2023-01-25 14:49:14,579 - INFO - ms2_scan_duration: 0.037
2023-01-25 14:49:14,579 - INFO - isolation_window: 30
2023-01-25 14:49:14,579 - INFO - resolution_at: 200
2023-01-25 14:49:14,579 - INFO - n_points_gt_fwhm: 3
2023-01-25 14:49:14,579 - INFO - esi_instability: 20
2023-01-25 14:49:14,579 - INFO - ms1_ppm_error_mean: 0
2023-01-25 14:49:14,579 - INFO - ms1_ppm_error_stdev: 0
2023-01-25 14:49:14,579 - INFO - ms2_ppm_error_mean: 0
2023-01-25 14:49:14,579 - INFO - ms2_ppm_error_stdev: 0
2023-01-25 14:49:14,579 - INFO - rt_peak_fwhm_distribution_model: exponentially_modified_gaussian
2023-01-25 14:49:14,579 - INFO - rt_peak_fwhm_distribution_mean: 4
2023-01-25 14:49:14,579 - INFO - rt_peak_fwhm_distribution_stdev: 1
2023-01-25 14:49:14,579 - INFO - rt_peak_fwhm_distribution_emg_k: 1
2023-01-25 14:49:14,579 - INFO - min_rt_peak_fwhm: 1
2023-01-25 14:49:14,579 - INFO - original_run_length: 0
2023-01-25 14:49:14,579 - INFO - new_run_length: 0
2023-01-25 14:49:14,579 - INFO - rt_buffer: 5
2023-01-25 14:49:14,579 - INFO - rt_instability: 15
2023-01-25 14:49:14,579 - INFO - ms1_min_peak_intensity: 100
2023-01-25 14:49:14,579 - INFO - ms2_min_peak_intensity: 10
2023-01-25 14:49:14,579 - INFO - centroid_ms1: True
2023-01-25 14:49:14,579 - INFO - centroid_ms2: True
2023-01-25 14:49:14,579 - INFO - write_empty_spectra: False
2023-01-25 14:49:14,579 - INFO - mz_peak_model: gaussian
2023-01-25 14:49:14,579 - INFO - rt_peak_model: exponentially_modified_gaussian
2023-01-25 14:49:14,579 - INFO - mz_emg_k: 2
2023-01-25 14:49:14,579 - INFO - rt_emg_k: 1
2023-01-25 14:49:14,579 - INFO - prob_missing_in_sample: 0
2023-01-25 14:49:14,579 - INFO - prob_missing_in_group: 0
2023-01-25 14:49:14,579 - INFO - no_isotopes: False
2023-01-25 14:49:14,579 - INFO - tic: False
2023-01-25 14:49:14,579 - INFO - schema: False
2023-01-25 14:49:14,579 - INFO - all: False
2023-01-25 14:49:14,579 - INFO - n_groups: 1
2023-01-25 14:49:14,579 - INFO - samples_per_group: 1
2023-01-25 14:49:14,579 - INFO - between_group_stdev: 1.0
2023-01-25 14:49:14,580 - INFO - within_group_stdev: 0.2
2023-01-25 14:49:14,580 - INFO - decoy_msp_file: None
2023-01-25 14:49:14,580 - INFO - num_decoys: 500
2023-01-25 14:49:14,580 - INFO - simulate_top_n_decoy_fragments: 15
2023-01-25 14:49:14,580 - INFO - decoy_abundance_mean: 22
2023-01-25 14:49:14,580 - INFO - decoy_abundance_stdev: 3
2023-01-25 14:49:14,580 - INFO - preview: False
2023-01-25 14:49:14,580 - INFO - preview_sequence: SAMPLER
2023-01-25 14:49:14,580 - INFO - preview_charge: 2
2023-01-25 14:49:14,580 - INFO - preview_abundance: 1000000
2023-01-25 14:49:14,580 - INFO - Calculating peak parameters
2023-01-25 14:49:14,580 - INFO - The specified MaxQuant msms.txt file does not exist
2023-01-25 14:49:14,580 - INFO - Exiting
Traceback (most recent call last):
  File "/local/home/usr/.local/bin/synthedia", line 8, in <module>
    sys.exit(main())
  File "/local/home/usr/.local/lib/python3.8/site-packages/synthedia/main.py", line 200, in main
    assembly.assemble(options)
  File "/local/home/usr/.local/lib/python3.8/site-packages/synthedia/assembly.py", line 398, in assemble
    options = get_extra_parameters(options)
  File "/local/home/usr/.local/lib/python3.8/site-packages/synthedia/assembly.py", line 329, in get_extra_parameters
    options.original_run_length = get_rt_range_from_input_data(options)
  File "/local/home/usr/.local/lib/python3.8/site-packages/synthedia/assembly.py", line 270, in get_rt_range_from_input_data
    raise IncorrectInputError(msg)
synthedia.assembly.IncorrectInputError: The specified MaxQuant msms.txt file does not exist

mgleeming commented 1 year ago

Hi,

The --mq_txt_dir parameter needs to point to a directory that contains both the msms.txt and evidence.txt files. Above, it's pointing to the msms.txt file specifically. The corresponding evidence.txt file can be downloaded here.

Also note that the files must be named exactly msms.txt and evidence.txt, so the file you've named example_msms.txt should be renamed to msms.txt. This is because we're assuming that synthedia is being directed to the txt directory produced by MaxQuant.

For example, with the directory structure:

.
└── example
    └── txt
        ├── evidence.txt
        └── msms.txt

we'd run:

synthedia --mq_txt_dir example/txt --centroid_ms1 --centroid_ms2
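
If it's useful, the layout described above can be checked before launching a run. The snippet below is only a rough sketch and is not part of synthedia: check_mq_txt_dir and REQUIRED_FILES are hypothetical names made up for this illustration, and it tests nothing beyond the two file names mentioned in this thread. It also expands a leading ~, since a tilde inside shell quotes (as in the command in the original post) is passed to the program literally; whether synthedia expands it itself isn't stated here.

```python
from pathlib import Path

# File names taken from the reply above; the tuple name itself is hypothetical.
REQUIRED_FILES = ("msms.txt", "evidence.txt")


def check_mq_txt_dir(path):
    """Return a list of problems with a candidate --mq_txt_dir value.

    Hypothetical helper for illustration; not part of the synthedia package.
    An empty list means the directory has the expected layout.
    """
    # A quoted '~' is not expanded by the shell, so expand it here.
    txt_dir = Path(path).expanduser()

    # The option expects a directory, not the msms.txt file itself.
    if not txt_dir.is_dir():
        return [f"{txt_dir} is not a directory"]

    # Report every required file that is absent from the directory.
    return [
        f"missing {name} in {txt_dir}"
        for name in REQUIRED_FILES
        if not (txt_dir / name).is_file()
    ]
```

For the directory tree shown above, check_mq_txt_dir("example/txt") should return an empty list, while passing the path of the msms.txt file itself yields a single "is not a directory" message.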

Hope that helps :)