OpenMS / OpenMS

The codebase of the OpenMS project
https://www.openms.de

pyopenMS InstrumentSettings() #4696

Open JeffEdge opened 4 years ago

JeffEdge commented 4 years ago

Hi all! I am using pyOpenMS to create an LC-MS file (following heavy preprocessing) for upload to a postprocessing tool. This tool is rather picky, and I need quite a few classes. I have been using the MSSpectrum() class rather successfully, with the exception of setMSLevel(), which behaves suboptimally when two objects of different levels are added to an experiment: if the first object uses setMSLevel(1), then all subsequently added objects partly inherit a level of 1. This is not the case for levels > 1. This can be seen in TOPPView, where the display also differs from that of a standard (instrument-generated) LC-MS file.

BUT MY MAIN QUESTION is how to use the InstrumentSettings() constructor to create an object that enables setPolarity(), setScanMode() and setScanWindows(). Basically, I want to change the corresponding parameters and save the metadata as well as the data to an mzML file. I am exploring CachedmzML.store(), but I am badly in need of a working example! Thanks for the Python library!

JeffEdge commented 4 years ago

Dear Timo, please find the code snippet below (and the resulting mzML at the end). Sorry, the insert-code formatting does not work, so I added comments as to where to indent. EDIT: fixed. The mzML templating is ugly, but pasting works: the file can be opened using TOPPView. Thanks!

from pyopenms import *
import numpy as np

file2 = MzMLFile()
filename_in  = "Example.mzML"
exp = MSExperiment()
file2.load(filename_in, exp)
filename_out = "output.mzML"

s1 = MSSpectrum()
s1.setRT(0.5)
s1.setMSLevel(1)

# Copy every peak inside the precursor m/z window from all input spectra into s1.
for spec in exp:
    for peak in spec:
        mz = peak.getMZ()
        inten = peak.getIntensity()
        if 47493.00 < mz < 47516.00:
            p1 = Peak1D()
            p1.setMZ(mz)
            p1.setIntensity(inten)
            s1.push_back(p1)

s = MSSpectrum()
s.setRT(1.0)
s.setMSLevel(2)

# The precursor is identical for every input spectrum, so build it once.
p = Precursor()
p.setIsolationWindowLowerOffset(5.0)
p.setIsolationWindowUpperOffset(5.0)
p.setMZ(47504.51)
p.setActivationEnergy(40)
p.setCharge(1)
s.setPrecursors([p])

# Copy every peak inside the fragment m/z window from all input spectra into s.
for spec2 in exp:
    for peak2 in spec2:
        mz2 = peak2.getMZ()
        inten2 = peak2.getIntensity()
        if 9500 < mz2 < 9600:
            p2 = Peak1D()
            p2.setMZ(mz2)
            p2.setIntensity(inten2)
            s.push_back(p2)

e = MSExperiment()
s1.setMSLevel(1)
s1.updateRanges()
s1.setType(1)  # 1 = centroid; stored as "centroid spectrum" in the mzML
e.addSpectrum(s1)

s.setMSLevel(2)
s.updateRanges()
s.setType(1)  # 1 = centroid; stored as "centroid spectrum" in the mzML
e.addSpectrum(s)

file2.store(filename_out, e)

I also attach the resulting mzML file. I think I need to use InstrumentSettings or SpectrumSettings, as not all "codes" are updated using MSSpectrum() and setMSLevel(). needs to differ for MSnSpectrum

<?xml version="1.0" encoding="ISO-8859-1"?>
<indexedmzML xmlns="http://psi.hupo.org/ms/mzml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0_idx.xsd">
<mzML xmlns="http://psi.hupo.org/ms/mzml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd" accession="" version="1.1.0">
    <cvList count="5">
        <cv id="MS" fullName="Proteomics Standards Initiative Mass Spectrometry Ontology" URI="http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo"/>
        <cv id="UO" fullName="Unit Ontology" URI="http://obo.cvs.sourceforge.net/obo/obo/ontology/phenotype/unit.obo"/>
        <cv id="BTO" fullName="BrendaTissue545" version="unknown" URI="http://www.brenda-enzymes.info/ontology/tissue/tree/update/update_files/BrendaTissueOBO"/>
        <cv id="GO" fullName="Gene Ontology - Slim Versions" version="unknown" URI="http://www.geneontology.org/GO_slims/goslim_goa.obo"/>
        <cv id="PATO" fullName="Quality ontology" version="unknown" URI="http://obo.cvs.sourceforge.net/*checkout*/obo/obo/ontology/phenotype/quality.obo"/>
    </cvList>
    <fileDescription>
        <fileContent>
            <cvParam cvRef="MS" accession="MS:1000294" name="mass spectrum" />
        </fileContent>
    </fileDescription>
    <sampleList count="1">
        <sample id="sa_0" name="">
            <cvParam cvRef="MS" accession="MS:1000004" name="sample mass" value="0" unitAccession="UO:0000021" unitName="gram" unitCvRef="UO" />
            <cvParam cvRef="MS" accession="MS:1000005" name="sample volume" value="0" unitAccession="UO:0000098" unitName="milliliter" unitCvRef="UO" />
            <cvParam cvRef="MS" accession="MS:1000006" name="sample concentration" value="0" unitAccession="UO:0000175" unitName="gram per liter" unitCvRef="UO" />
        </sample>
    </sampleList>
    <softwareList count="2">
        <software id="so_in_0" version="" >
            <cvParam cvRef="MS" accession="MS:1000799" name="custom unreleased software tool" value="" />
        </software>
        <software id="so_default" version="" >
            <cvParam cvRef="MS" accession="MS:1000799" name="custom unreleased software tool" value="" />
        </software>
    </softwareList>
    <instrumentConfigurationList count="1">
        <instrumentConfiguration id="ic_0">
            <cvParam cvRef="MS" accession="MS:1000031" name="instrument model" />
            <softwareRef ref="so_in_0" />
        </instrumentConfiguration>
    </instrumentConfigurationList>
    <dataProcessingList count="1">
        <dataProcessing id="dp_sp_0">
            <processingMethod order="0" softwareRef="so_default">
                <cvParam cvRef="MS" accession="MS:1000544" name="Conversion to mzML" />
                <userParam name="warning" type="xsd:string" value="fictional processing method used to fulfill format requirements" />
            </processingMethod>
        </dataProcessing>
    </dataProcessingList>
    <run id="ru_0" defaultInstrumentConfigurationRef="ic_0" sampleRef="sa_0">
        <spectrumList count="2" defaultDataProcessingRef="dp_sp_0">
            <spectrum id="spectrum=0" index="0" defaultArrayLength="99" dataProcessingRef="dp_sp_0">
                <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" />
                <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1" />
                <cvParam cvRef="MS" accession="MS:1000294" name="mass spectrum" />
                <scanList count="1">
                    <cvParam cvRef="MS" accession="MS:1000795" name="no combination" />
                    <scan>
                        <cvParam cvRef="MS" accession="MS:1000016" name="scan start time" value="0.5" unitAccession="UO:0000010" unitName="second" unitCvRef="UO" />
                    </scan>
                </scanList>
                <binaryDataArrayList count="2">
                    <binaryDataArray encodedLength="1056">
                        <cvParam cvRef="MS" accession="MS:1000514" name="m/z array" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                        <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float" />
                        <cvParam cvRef="MS" accession="MS:1000576" name="no compression" />
                        <binary>2ZahpqYw50A1qv2KrzDnQDWq/YqvMOdAz+2E2K8w50CYz7a/sDDnQBPW6WSyMOdAN3LOlc8w50DNG7XozzDnQBgsE/bPMOdAlzB3MdIw50DNJ3Dp7jDnQFGz5fjvMOdAuFUZAfAw50BqFUH68TDnQGvLlX8PMedAozdn2g8x50A6DhYNEDHnQLTPI9ERMedAa2+Psy4x50A7VMGwLzHnQCfq/SIwMedASSU0wDEx50DL/SQWRDHnQDWEht9OMedAPEcjxk8x50CP+QBJUDHnQNQAovhRMedAdek6MmIx50BA/AVhbjHnQI9D/l5vMedAvDVsSHAx50DTBFwpcjHnQBunpwiOMedAUHIh0Y8x50A6BUAakDHnQPOw5gSSMedAss36bq4x50BjHkfqrzHnQF1SevWvMedAtDlfrrEx50Cfu7R/zjHnQJPwdbzPMedA5uUA8s8x50CM68FQ0THnQEBOhFbvMedAKBJi9e8x50BUGks38DHnQMADsebwMedA904wBA8y50B9m0QKEDLnQJA9WS4QMudAeQnreRAy50BT2uw/LzLnQMTgdCMwMudAfg3pJDAy50AEipB5MDLnQOBm/u9PMudAv2y+MVAy50Dbm45JUDLnQN2N6PxQMudA0HACwG8y50D+/Pg7cDLnQIgtLWRwMudAFjcvnHAy50BlPwy7jzLnQHPnoySQMudAT3J1ZpAy50Dn8SsUkTLnQBzKa3ivMudA29ca068y50CQE89GsTLnQMl2D6SxMudANfemI88y50C0GwqXzzLnQD1m25/PMudAc1VyAtEy50BOABMv0TLnQKocWcTuMudAlTEbDO8y50DKplKG7zLnQGTujjHwMudAUSp9g/Ey50C1VHZjDjPnQLl6VGsOM+dAS4N/fw8z50C3QZsMETPnQL2mvroRM+dABJHUoy4z50BPrHiDLzPnQHkuWl0xM+dABDXVnDEz50BbhpBKTjPnQOJXaXpPM+dAmtKyZFAz50AZHiY6UTPnQIdB/NxtM+dAApunW28z50DQ60FgcDPnQPaDJXhwM+dA</binary>
                    </binaryDataArray>
                    <binaryDataArray encodedLength="528">
                        <cvParam cvRef="MS" accession="MS:1000515" name="intensity array" unitAccession="MS:1000131" unitName="number of detector counts" unitCvRef="MS"/>
                        <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float" />
                        <cvParam cvRef="MS" accession="MS:1000576" name="no compression" />
                        <binary>GWAwRj3SZkRG1o9EwjieRiaRl0VSVglH0wiSRA2cpkbvV5BF7sUeR/pkmEQQwLdGECSlRTSaOkfFYmtEqgWjRXmj00astVVHCLWLRP6uuEU6IOFGHTdtR79Q9EYHxr5E0CXPRUmeAkf834BHgKcVR7wb0EQwpABG+A8VR3Z9mEeskexESNkbRtP7I0feh6xH7McZRWVIK0Z86D5H+rLDR7FmTkXum09GScNZRyIw00fYFEVF2SJuR4yXb0YPuN1HB22HRVtUe0dI8HlGf3HlRwxjhEXL+H9Hn2jiR64BikYiNdVHbcp9RyjshUV/6ohGLeLGR83ib0caeotFM1uGRmZuvEdavVRHgkCJRaBdgEZ1G6xH8y9CRxZDWkbApWtFB4qbR1DqhUeMsi9HQtFDRiLgaEVqOopHsd9/Ry9+GUfyEEJFHEQwRpVFXUcBAW5H2bMKRyuiDEYsuBZFpf42RwJ55UYekf1FwIUHRbc7QEcy6MNGhCfjRBT2tkVFBDZHWqKzRoC1tURt6bJF</binary>
                    </binaryDataArray>
                </binaryDataArrayList>
            </spectrum>
            <spectrum id="spectrum=1" index="1" defaultArrayLength="12">
                <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" />
                <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2" />
                <cvParam cvRef="MS" accession="MS:1000294" name="mass spectrum" />
                <scanList count="1">
                    <cvParam cvRef="MS" accession="MS:1000795" name="no combination" />
                    <scan>
                        <cvParam cvRef="MS" accession="MS:1000016" name="scan start time" value="1" unitAccession="UO:0000010" unitName="second" unitCvRef="UO" />
                    </scan>
                </scanList>
                <precursorList count="1">
                    <precursor>
                        <isolationWindow>
                            <cvParam cvRef="MS" accession="MS:1000827" name="isolation window target m/z" value="47504.51" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                            <cvParam cvRef="MS" accession="MS:1000828" name="isolation window lower offset" value="5" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                            <cvParam cvRef="MS" accession="MS:1000829" name="isolation window upper offset" value="5" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                        </isolationWindow>
                        <selectedIonList count="1">
                            <selectedIon>
                                <cvParam cvRef="MS" accession="MS:1000744" name="selected ion m/z" value="47504.51" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                                <cvParam cvRef="MS" accession="MS:1000041" name="charge state" value="1" />
                            </selectedIon>
                        </selectedIonList>
                        <activation>
                            <cvParam cvRef="MS" accession="MS:1000509" name="activation energy" value="40" unitAccession="UO:0000266" unitName="electronvolt" unitCvRef="UO" />
                            <cvParam cvRef="MS" accession="MS:1000044" name="dissociation method" />
                        </activation>
                    </precursor>
                </precursorList>
                <binaryDataArrayList count="2">
                    <binaryDataArray encodedLength="128">
                        <cvParam cvRef="MS" accession="MS:1000514" name="m/z array" unitAccession="MS:1000040" unitName="m/z" unitCvRef="MS" />
                        <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float" />
                        <cvParam cvRef="MS" accession="MS:1000576" name="no compression" />
                        <binary>d9LudVSOwkD+QVPS1I7CQGmkc/pTj8JACk1GxtWPwkDd6IRuVJDCQMHXJFzWkMJAeQXeWFeRwkBOkzbf2JHCQBOhdrJYksJAlc+bi9eSwkDthE8wWJPCQHV18xvXk8JA</binary>
                    </binaryDataArray>
                    <binaryDataArray encodedLength="64">
                        <cvParam cvRef="MS" accession="MS:1000515" name="intensity array" unitAccession="MS:1000131" unitName="number of detector counts" unitCvRef="MS"/>
                        <cvParam cvRef="MS" accession="MS:1000521" name="32-bit float" />
                        <cvParam cvRef="MS" accession="MS:1000576" name="no compression" />
                        <binary>Vu8EQm5Ng0Ei77VBstQUQnKK1UHez9ZB5Ct5QkVyD0Kml7pB0KxAQtANcEGO8l5B</binary>
                    </binaryDataArray>
                </binaryDataArrayList>
            </spectrum>
        </spectrumList>
    </run>
</mzML>
<indexList count="1">
    <index name="spectrum">
        <offset idRef="spectrum=0">3028</offset>
        <offset idRef="spectrum=1">6044</offset>
    </index>
</indexList>
<indexListOffset>8889</indexListOffset>
<fileChecksum>0</fileChecksum>
</indexedmzML>
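
To check which peaks actually ended up in the file, the `<binary>` payloads can be decoded by hand. This is a small sketch using only the standard library: with "no compression", each payload is base64 text holding little-endian floats (64-bit for the m/z array, 32-bit for the intensity array, as the cvParams state). The helper name `decode_binary` is made up here; the two strings are copied from the second spectrum above.

```python
import base64
import struct


def decode_binary(payload, bits):
    """Decode an uncompressed mzML <binary> payload into a list of floats.

    bits is 64 for "64-bit float" arrays and 32 for "32-bit float" arrays.
    """
    raw = base64.b64decode(payload)
    code = "d" if bits == 64 else "f"
    count = len(raw) // (bits // 8)
    # mzML stores binary arrays in little-endian byte order.
    return list(struct.unpack("<%d%s" % (count, code), raw))


# Payloads of spectrum=1 (the MS2 spectrum) from the file above.
MZ_B64 = (
    "d9LudVSOwkD+QVPS1I7CQGmkc/pTj8JACk1GxtWPwkDd6IRuVJDCQMHXJFzWkMJA"
    "eQXeWFeRwkBOkzbf2JHCQBOhdrJYksJAlc+bi9eSwkDthE8wWJPCQHV18xvXk8JA"
)
INTENSITY_B64 = (
    "Vu8EQm5Ng0Ei77VBstQUQnKK1UHez9ZB5Ct5QkVyD0Kml7pB0KxAQtANcEGO8l5B"
)

mz = decode_binary(MZ_B64, 64)
intensity = decode_binary(INTENSITY_B64, 32)

for m, i in zip(mz, intensity):
    print("%.4f\t%.2f" % (m, i))
```

Both arrays should have 12 entries, matching defaultArrayLength="12", and every m/z value should fall inside the 9500 to 9600 window used by the script.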

JeffEdge commented 4 years ago

For the first part of the question: use, for instance, s1.setMSLevel(3) as a comparison. It appears twice; I just wanted to make sure :-)
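
One way to make that comparison concrete is to list the cvParam entries written for each spectrum and diff them between runs. The sketch below uses only the standard library; `spectrum_cv_params` is a hypothetical helper, and `SAMPLE` is an abridged stand-in for the stored file that keeps just the per-spectrum cvParams from the output above (both spectra carry the generic "mass spectrum" term there).

```python
import xml.etree.ElementTree as ET

MZML_NS = {"mz": "http://psi.hupo.org/ms/mzml"}


def spectrum_cv_params(root):
    """Map each spectrum id to the cvParams set directly on that spectrum."""
    result = {}
    for spectrum in root.iter("{http://psi.hupo.org/ms/mzml}spectrum"):
        params = {}
        # Only direct children: nested scan/precursor cvParams are skipped.
        for cv in spectrum.findall("mz:cvParam", MZML_NS):
            params[cv.get("name")] = cv.get("value", "")
        result[spectrum.get("id")] = params
    return result


# Abridged stand-in for the stored mzML (binary arrays and metadata removed).
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="ru_0">
    <spectrumList count="2">
      <spectrum id="spectrum=0" index="0" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" />
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1" />
        <cvParam cvRef="MS" accession="MS:1000294" name="mass spectrum" />
      </spectrum>
      <spectrum id="spectrum=1" index="1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" />
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2" />
        <cvParam cvRef="MS" accession="MS:1000294" name="mass spectrum" />
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""

params = spectrum_cv_params(ET.fromstring(SAMPLE))
for spectrum_id, cv in params.items():
    print(spectrum_id, cv)
```

For a file on disk, `spectrum_cv_params(ET.parse("output.mzML").getroot())` should work the same way, including for the indexed variant, since the helper searches the whole tree for spectrum elements.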

jpfeuffer commented 4 years ago

Hey Jeff, I updated the code blocks for you. You just needed to change "`" to "```" for code blocks. Adding "python" after "```" also enables syntax highlighting. Thanks for the snippets.

JeffEdge commented 4 years ago

Thanks!!!