Nesvilab / FragPipe

A cross-platform Graphical User Interface (GUI) for running MSFragger and Philosopher - powered pipeline for comprehensive analysis of shotgun proteomics data
http://fragpipe.nesvilab.org

mzXML specified in output pep.xml for mzML input #88

Closed. alephreish closed this issue 5 years ago

alephreish commented 5 years ago

Describe the problem

The pep.xml that MSFragger generates for an mzML input erroneously specifies mzXML as the input format in the raw_data attribute of the msms_run_summary tag:

$ java -jar MSFragger.jar msfragger.param data.mzML &> msfragger.log
$ grep msms_run_summary data.pep.xml
<msms_run_summary base_name="data" raw_data_type="raw" raw_data=".mzXML">
$ head -n24 data.mzML 
<?xml version="1.0" encoding="utf-8"?>
<indexedmzML xmlns="http://psi.hupo.org/ms/mzml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.2_idx.xsd">
  <mzML xmlns="http://psi.hupo.org/ms/mzml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd" id="data" version="1.1.0">
    <cvList count="2">
      <cv id="MS" fullName="Proteomics Standards Initiative Mass Spectrometry Ontology" version="4.1.12" URI="https://raw.githubusercontent.com/HUPO-PSI/psi-ms-CV/master/psi-ms.obo"/>
      <cv id="UO" fullName="Unit Ontology" version="09:04:2014" URI="https://raw.githubusercontent.com/bio-ontology-research-group/unit-ontology/master/unit.obo"/>
    </cvList>
    <fileDescription>
      <fileContent>
        <cvParam cvRef="MS" accession="MS:1000580" name="MSn spectrum" value=""/>
        <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/>
      </fileContent>
      <sourceFileList count="1">
        <sourceFile id="data.mgf" name="data.mgf" location="input">
          <cvParam cvRef="MS" accession="MS:1000774" name="multiple peak list nativeID format" value=""/>
          <cvParam cvRef="MS" accession="MS:1001062" name="Mascot MGF format" value=""/>
        </sourceFile>
      </sourceFileList>
    </fileDescription>
    <softwareList count="1">
      <software id="pwiz_3.0.18256" version="3.0.18256">
        <cvParam cvRef="MS" accession="MS:1000615" name="ProteoWizard software" value=""/>
      </software>
    </softwareList>
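
The mismatch shown above can also be checked programmatically. The following is only a minimal sketch, not part of MSFragger or TPP: the helper names `run_summary_attrs` and `declared_extension_matches` are hypothetical, and it assumes the first `msms_run_summary` tag is the one of interest and that its attributes use double quotes, as in the grep output.

```python
import re
from pathlib import Path

# Matches the opening <msms_run_summary ...> tag and captures its attribute text.
_TAG = re.compile(r"<msms_run_summary\b([^>]*)>")
# Matches name="value" pairs (double-quoted values only, an assumption of this sketch).
_ATTR = re.compile(r'([\w:]+)="([^"]*)"')


def run_summary_attrs(pepxml_text):
    """Return the attributes of the first msms_run_summary tag as a dict."""
    match = _TAG.search(pepxml_text)
    if match is None:
        raise ValueError("no msms_run_summary tag found")
    return dict(_ATTR.findall(match.group(1)))


def declared_extension_matches(pepxml_text, input_path):
    """True if raw_data equals the input file's extension (case-insensitive)."""
    declared = run_summary_attrs(pepxml_text).get("raw_data", "")
    return declared.lower() == Path(input_path).suffix.lower()


# The tag reported in this issue, checked against the actual input file name.
line = '<msms_run_summary base_name="data" raw_data_type="raw" raw_data=".mzXML">'
print(run_summary_attrs(line)["raw_data"])          # .mzXML
print(declared_extension_matches(line, "data.mzML"))  # False
```

A regular expression is used instead of a full XML parser so that the check works on a single grepped line as well as on a whole file read into memory.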

For TPP this is not a big problem, since xinteract falls back to .mzML and eventually locates the correct file:

$ xinteract data.pep.xml 2>&1 1>/dev/null | head -n4
WARNING: cannot open data file /my/path/data.mzXML in msms_run_summary tag... trying .mzML ...
SUCCESS: CORRECTED data file /my/path/data.mzML in msms_run_summary tag...
INFO: Results written to file: /my/path/interact.pep.xml
  - Building Commentz-Walter keyword tree... PeptideProphet  (TPP v5.1.0 Syzygy, Build 201812092113-7877 (Linux-x86_64)) AKeller@ISB
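
The fallback visible in the log above (try the declared extension, then another one) can be imitated in downstream scripts that read the affected pep.xml files. This is a sketch of the general idea only, not TPP's actual implementation: the function name `resolve_raw_file` and the candidate list `FALLBACK_EXTENSIONS` are assumptions, since the log shows nothing beyond `.mzXML` followed by `.mzML`.

```python
import tempfile
from pathlib import Path

# Hypothetical candidate list; the log only shows .mzXML followed by .mzML.
FALLBACK_EXTENSIONS = (".mzML", ".mzXML")


def resolve_raw_file(directory, base_name, declared_ext, fallbacks=FALLBACK_EXTENSIONS):
    """Return the first existing file among the declared extension and the fallbacks, or None."""
    tried = []
    for ext in (declared_ext, *fallbacks):
        if ext in tried:
            continue  # do not probe the same extension twice
        tried.append(ext)
        candidate = Path(directory) / (base_name + ext)
        if candidate.is_file():
            return candidate
    return None


# Demo in a temporary directory: only data.mzML exists, but .mzXML is declared.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data.mzML").write_text("")
    found = resolve_raw_file(tmp, "data", ".mzXML")
    print(found.name)  # data.mzML
```

Returning `None` when nothing is found leaves it to the caller to decide whether a missing spectrum file is an error or only worth a warning.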

System info

You can find that printed on the Config tab.


Describe your experiment

Not relevant; available on request.


Attach fragger.params file

Not relevant; available on request.

Run log output

Not relevant

guoci commented 5 years ago

I have fixed that for the next release.