qcxms / PlotMS

Plot program for QCxMS spectrum plotting
GNU Lesser General Public License v3.0

export MSP file for NISTMS import [improvement] #7

Open tobigithub opened 3 years ago

tobigithub commented 3 years ago

Hi, this is for plotms 5.1. It is a placeholder for mergers and potential improvements.

MSP files are NIST MS spectral files that contain meta-information. This allows users to use the NIST Search Software (the software is free, the library is not). The software has a number of important features and can search libraries very fast.

I am not sure whether plotms should export MSP files directly; this can also be done with a small external Python script. The script below reads the output from plotms (with accurate masses and formulas) and converts it into an MSP file.

In order to do MS/MS search (which requires a precursor ion), such information must be provided. We solved that by adding a meta-data file to each run. So once the accurate mass export is solved, users can just use the example file below.

Example input and output files: qceims.res.txt accuratemass.jdx.txt result-ms2.msp.txt

For example (metadata.txt):

Name: Verapamil
Precursor_type: [M+H]+
Spectrum_type: MS2
PrecursorMZ: 455.2904
Ion_mode: P
InChIKey: SGTNSNPWRIOYBX-UHFFFAOYSA-N
Formula: C27H38N2O4
Comments: protomer of verapamil [M+H]+

and the converter file (ms2-export.py):

### -----------------------------------------------------------------------
### MSP spectrum file exporter in NIST MSP format
### (Tobias Kind // Fiehnlab 2020 // CC-BY license)  
###
### format the MS2 file for accurate mass output and meta-data inclusion
### requires as input: metadata.txt in input subfolder
### reads: accuratemass.jdx from plotms
### exports: result-ms2.msp
### -----------------------------------------------------------------------

# === metadata.txt has 8 pre-defined lines ====================================
# Name: Verapamil
# Precursor_type: [M+H]+
# Spectrum_type: MS2
# PrecursorMZ: 455.2904
# Ion_mode: P
# InChIKey: SGTNSNPWRIOYBX-UHFFFAOYSA-N
# Formula: C27H38N2O4
# Comments: protomer of verapamil [M+H]+
#==============================================================================

import os
import pathlib

### get current working directory
workPath = os.getcwd() + '/'

### strip trailing blank lines (CR/LF) and end the block with exactly one
### newline, so that NumPeaks always starts on its own line
metadataFilename = 'input/metadata.txt'
metadataHandle = workPath + metadataFilename
metadataText = pathlib.Path(metadataHandle).read_text().strip() + '\n'

# read spectrum file from plotms
spectrumFilename = ('accuratemass.jdx')
spectrumHandle = workPath + spectrumFilename
spectrumText = pathlib.Path(spectrumHandle).read_text()

# get number of peaks from accuratemass.jdx (element [5], i.e. line 6)
# int() also drops stray whitespace or a trailing CR from the value
numPeaks = int(spectrumText.split('\n')[5].split('=')[1].strip())

# get mz-abd pairs and add all peaks and annotations
peakPairs = spectrumText.split('\n')[7:int(numPeaks)+7]
peakPairsText = '\n'.join(peakPairs)

# open file for writing new MSP (NIST format)
exportFilename = ('result-ms2.msp')
exportHandle = workPath + exportFilename
with open(exportHandle, 'w') as f:
    # write complete metadata block
    f.write(metadataText)
    # add number of peaks
    numPeaksText = 'NumPeaks: ' + str(numPeaks) + '\n'
    f.write(numPeaksText)
    # write m/z - abundance pairs with annotation
    f.write(peakPairsText)
    # the with-block closes the file automatically

# Program finished
print('\nResult MSP file converted: result-ms2.msp\n')
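
For merging into plotms, or for testing, the same conversion could also be written as a function that works on strings instead of files. This is only a sketch: `build_msp` is a hypothetical name, and it keeps the script's positional assumptions about accuratemass.jdx (peak count after the `=` on line 6, peak lines starting on line 8). It also fails loudly if the file holds fewer peak lines than announced.

```python
def build_msp(metadata_text: str, jdx_text: str) -> str:
    """Combine a metadata block and plotms peak lines into MSP text.

    Assumes, like the script above, that line 6 of the JDX text holds
    the peak count after '=' and that peak lines start on line 8.
    """
    lines = jdx_text.splitlines()
    if len(lines) < 7:
        raise ValueError('JDX text is too short to contain a peak table')

    # peak count: everything after the first '=' on line 6
    _, sep, count = lines[5].partition('=')
    if not sep:
        raise ValueError(f'no "=" found in line 6: {lines[5]!r}')
    num_peaks = int(count.strip())

    # m/z - abundance lines (with annotations), copied unchanged
    peaks = lines[7:7 + num_peaks]
    if len(peaks) != num_peaks:
        raise ValueError(f'expected {num_peaks} peaks, found {len(peaks)}')

    # exactly one newline between the metadata block and NumPeaks
    header = metadata_text.strip() + '\n'
    return header + f'NumPeaks: {num_peaks}\n' + '\n'.join(peaks) + '\n'
```

Passing the contents of input/metadata.txt and accuratemass.jdx to `build_msp` returns the text that would be written to result-ms2.msp.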

The metadata fields for NIST MS search MSP files are documented in the NIST MS Help.
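
Because the MS/MS search depends on the precursor entry, it may be worth checking metadata.txt before conversion. Here is a minimal sketch, assuming the eight field names from the example above are the ones wanted. The NIST MS Help remains the authority on which fields are actually required. `EXPECTED_FIELDS` and `check_metadata` are hypothetical names.

```python
# field names taken from the metadata.txt example (an assumption, not a NIST rule)
EXPECTED_FIELDS = [
    'Name', 'Precursor_type', 'Spectrum_type', 'PrecursorMZ',
    'Ion_mode', 'InChIKey', 'Formula', 'Comments',
]


def check_metadata(metadata_text: str) -> dict:
    """Parse 'Key: value' lines and raise ValueError on problems."""
    fields = {}
    for line in metadata_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        key, sep, value = line.partition(':')
        if not sep:
            raise ValueError(f'not a "Key: value" line: {line!r}')
        fields[key.strip()] = value.strip()

    missing = [name for name in EXPECTED_FIELDS if name not in fields]
    if missing:
        raise ValueError('missing metadata fields: ' + ', '.join(missing))

    # the precursor m/z has to be numeric; float() raises ValueError otherwise
    float(fields['PrecursorMZ'])
    return fields
```

The function returns the parsed fields as a dictionary, so the same values could be reused when writing the MSP header.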

Tobias

hechth commented 1 year ago

+1 on this issue - I think having native MSP export would be great.