robinparky / prolucidComPIL

ProLuCIDComPIL

We modified the original ProLuCID search engine to make it compatible with ComPIL metaproteomics data analysis.

ProLuCIDComPIL can be downloaded here: ProLuCIDComPIL.jar.

File formats

MS2 and SQT are plaintext file formats detailed in the following publication:

McDonald, W. H. et al. MS1, MS2, and SQT-three unified, compact, and easily parsed file formats for the storage of shotgun proteomic spectra and identifications. Rapid Commun. Mass Spectrom. 18, 2162–2168 (2004).

MS2 files can be generated from instrument RAW files using a tool such as RawConverter.

Input file format

ProLuCIDComPIL takes an MS2 file as input. MS2 files contain MS/MS precursor ion, charge, and fragment information:

MS2 Format
S       000040  000040  960.22797
I       RetTime 0.25
I       PrecursorInt    6606.3
I       IonInjectionTime        150.000
I       ActivationType  HCD
I       PrecursorFile   MSMS_sample.ms1
I       PrecursorScan   34
I       InstrumentType  FTMS
Z       4       3837.89004
109.4537 168.2 0
111.1992 175.5 0
112.6070 188.2 0
136.0749 575.7 0
143.1249 190.1 0
152.1059 178.3 0
...
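
As a rough illustration of how the record types above fit together, the following sketch reads MS2 text into one object per `S` record. It is not part of ProLuCIDComPIL; the names `Ms2Scan` and `parse_ms2` are hypothetical, and the sketch assumes whitespace-separated fields and ignores any third column on peak lines.

```python
from dataclasses import dataclass, field


@dataclass
class Ms2Scan:
    first_scan: int
    last_scan: int
    precursor_mz: float
    info: dict = field(default_factory=dict)     # I lines: key -> value (as text)
    charges: list = field(default_factory=list)  # Z lines: (charge, mass as written)
    peaks: list = field(default_factory=list)    # peak lines: (m/z, intensity)


def parse_ms2(lines):
    """Yield one Ms2Scan per S record found in an iterable of text lines."""
    scan = None
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        tag = parts[0]
        if tag == "S":
            if scan is not None:
                yield scan
            scan = Ms2Scan(int(parts[1]), int(parts[2]), float(parts[3]))
        elif scan is None:
            continue  # anything before the first S line (e.g. file headers)
        elif tag == "I":
            scan.info[parts[1]] = " ".join(parts[2:])
        elif tag == "Z":
            scan.charges.append((int(parts[1]), float(parts[2])))
        elif tag[0].isdigit():
            # Peak line: m/z, intensity; extra columns are ignored here.
            scan.peaks.append((float(parts[0]), float(parts[1])))
    if scan is not None:
        yield scan
```

With a file on disk, usage would look like `with open("example.ms2") as fh: scans = list(parse_ms2(fh))`.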

Output file format

ProLuCIDComPIL outputs search results in the SQT file format, which contains unfiltered proteomic scoring information, including the best scoring peptide matches for each scan, parent proteins for each matched peptide, and other search-related information.

SQT Format
S       10210   [information for scan #10210]
M       1       [best scoring peptide match]
L       [parent protein for peptide match 1]
L       [parent protein for peptide match 1]
L       [parent protein for peptide match 1]
M       2       [second-best scoring peptide match]
L       [parent protein for peptide match 2]
L       [parent protein for peptide match 2]
...
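
The nesting shown above (each `S` line owns the `M` lines that follow it, and each `M` line owns its `L` lines) can be recovered with a short reader. This is a minimal sketch, not the project's own parser: `parse_sqt` is a hypothetical name, only the first field after each tag is interpreted, and the remaining columns are kept as raw strings.

```python
def parse_sqt(lines):
    """Group M and L lines under the S line they belong to.

    Returns a list of spectra, each a dict with the scan identifier, the
    remaining S-line fields, and a list of matches with their parent loci.
    """
    spectra = []
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        tag, rest = fields[0], fields[1:]
        if tag == "S" and rest:
            spectra.append({"scan": rest[0], "fields": rest[1:], "matches": []})
        elif tag == "M" and rest and spectra:
            spectra[-1]["matches"].append(
                {"rank": rest[0], "fields": rest[1:], "loci": []}
            )
        elif tag == "L" and rest and spectra and spectra[-1]["matches"]:
            # First field of an L line is taken as the parent protein identifier.
            spectra[-1]["matches"][-1]["loci"].append(rest[0])
    return spectra
```

Other line types (for example, file header lines) are skipped by this sketch.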

Running locally

Tested on CentOS 7.

Requirements

Instructions


Configuration of build_compil

Download build_compil

  1. Go to the directory where build_compil was downloaded

  2. Edit ex/python/multiprocess_JSON_import.py

    • Change the "HOST" and "PORT" variables to match your MongoDB configuration
  3. Edit create_compil

    1. Change variable "FASTADB" so that it is assigned to "${ORIGFASTADB%.*}"_renumbered."${ORIGFASTADB##*.}"
    2. Change "MONGO_HOST" and "MONGO_PORT" to match your MongoDB configuration
      • Examples
        • "HOST = localhost"
        • "PORT = 27017"
  4. Edit blazmass.params

    • Change the "mongoDB_URI" parameter to match your MongoDB configuration
      • Example: "mongodb://localhost:27017"
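
The three files edited above all have to point at the same MongoDB instance, so it can help to derive the host and port from a single URI. The sketch below does that with the standard library, and also shows the file name the `FASTADB` assignment is assumed to produce (`_renumbered` inserted before the file extension). Both function names are hypothetical and are not part of build_compil.

```python
import os
from urllib.parse import urlsplit


def host_port_from_uri(uri):
    """Split a URI such as mongodb://localhost:27017 into (host, port)."""
    parts = urlsplit(uri)
    return parts.hostname, parts.port


def renumbered_name(orig_fasta):
    """Assumed result of the FASTADB assignment: name_renumbered.ext."""
    stem, ext = os.path.splitext(orig_fasta)
    return f"{stem}_renumbered{ext}"


if __name__ == "__main__":
    host, port = host_port_from_uri("mongodb://localhost:27017")
    print(f"HOST = {host}")
    print(f"PORT = {port}")
    print(renumbered_name("example.fasta"))
```

The printed host and port are the values to use for "HOST"/"PORT" and "MONGO_HOST"/"MONGO_PORT" if "mongoDB_URI" is set to the same URI.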

Upload Fasta file to MongoDB

ComPIL/MongoDB integration by Sandip Chatterjee & Greg Stupp

Configure Search Parameters

Download a sample search.xml here.

  1. Edit the "[database path]" line and replace "[database path]" with the path to the FASTA file
    • Example:
      • /home/yateslab/project_data/prolucid_compil/2610search/example.fasta
  2. Edit the "[insert database_name]" line and replace [insert database name] with the database name
    • Example:
      • testDB
    • The database name must be the same as the "database_name" used in step 3 of the "Upload Fasta file to MongoDB" process
  3. Edit the "[insert database url]" line and replace [insert database url] with the MongoDB URL
    • Example:
      • mongodb://localhost:27017
  4. Edit other parameters as necessary. The remaining parameters match those of a regular ProLuCID search.xml
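
If several searches need their own search.xml, the placeholder substitution in steps 1 to 3 can be scripted. The following is a minimal sketch that treats the file as plain text; `fill_placeholders` is a hypothetical helper, and the placeholder spellings used as keys are taken from the steps above and should be checked against the actual sample file. A missing placeholder raises an error instead of being silently skipped.

```python
def fill_placeholders(template_text, values):
    """Replace each placeholder string in template_text with its value."""
    for placeholder, value in values.items():
        if placeholder not in template_text:
            raise ValueError(f"placeholder not found: {placeholder}")
        template_text = template_text.replace(placeholder, value)
    return template_text


# Example values from the steps above; the keys are assumptions about
# how the placeholders are spelled in the sample search.xml.
EXAMPLE_VALUES = {
    "[database path]": "/home/yateslab/project_data/prolucid_compil/2610search/example.fasta",
    "[insert database_name]": "testDB",
    "[insert database url]": "mongodb://localhost:27017",
}
```

A typical use would read the sample file into a string, call `fill_placeholders(text, EXAMPLE_VALUES)`, and write the result to a new search.xml.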

Run Search

Download ProLuCIDComPIL here.

  1. Run "java -Xmx10G -jar prolucid_compil.jar example.ms2 search.xml [num_threads]"
    • search.xml - the parameter file, edited as described in "Configure Search Parameters"
    • [num_threads] - number of threads to assign to the search. In general, assigning more threads improves search performance but increases the load on the MongoDB server and the memory usage on the local node. The optimal number of threads per node depends heavily on node specifications, network configuration, and MongoDB sharding configuration. For our 8-shard MongoDB setup, I assigned 2 threads per node and allowed no more than 60 threads to access the MongoDB server.
    • Example:
      • java -Xmx10G -jar prolucid_compil.jar example.ms2 search.xml 4
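
When launching searches from a script, the command above can be assembled as an argument list, and the thread guidance turned into a simple node budget. This is a sketch under the assumptions stated in the comments; `build_search_command` and `max_nodes` are hypothetical helpers, and the numbers 60 and 2 are the ones reported for the 8-shard setup above, not general recommendations.

```python
def build_search_command(ms2_path, params_path, num_threads,
                         jar="prolucid_compil.jar", heap="10G"):
    """Return the java command from the Run Search step as an argument list."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return ["java", f"-Xmx{heap}", "-jar", jar,
            ms2_path, params_path, str(num_threads)]


def max_nodes(total_thread_cap, threads_per_node):
    """How many nodes can search at once without exceeding the thread cap."""
    return total_thread_cap // threads_per_node


if __name__ == "__main__":
    cmd = build_search_command("example.ms2", "search.xml", 4)
    print(" ".join(cmd))
    # With a cap of 60 threads on the MongoDB server and 2 threads per node:
    print(max_nodes(60, 2), "nodes")
```

The list returned by `build_search_command` can be passed directly to `subprocess.run(cmd, check=True)` to start the search.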

Running DTASelect