Closed Weronika77 closed 6 years ago
Are you running the latest ipaw? Can you try git pull first? Some updates were made. I'm not sure it will fix your problem, but we can look into it if it still doesn't work.
BTW, you should provide the hg19 genome instead of hg38. The pipeline still works with hg38, but you will get two versions of coordinates, because the other process in the pipeline maps the peptides to the hg19 genome.
Hi, the pipeline is currently a bit tailored towards our lab's sample fractionation system and therefore expects filenames to have a fraction number like this: samplename_or_something_fr01.mzML
This piece of code extracts the fraction number from the filename:
it.baseName.replaceFirst(/.*fr(\d\d).*/, "\$1").toInteger()
I can have a look later to see if that part can be removed, since I don't think we actually use the fractionation in this pipeline (it's just so standard in the lab that it got built in).
EDIT: as it turns out, the fraction-parsing was removed in this commit, so @yafeng was right: do a git pull to get the latest version, and then at least you will not get that particular error.
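For readers less familiar with Groovy, the fraction-parsing expression quoted above can be mirrored in Python. This is only an illustrative sketch, not code from the pipeline: the function name fraction_number is hypothetical, and it assumes the file name contains "fr" followed by two digits.

```python
import re
from pathlib import Path


def fraction_number(filename):
    """Return the two-digit fraction number that follows 'fr' in a file name."""
    # Drop the directory and the final extension, loosely like Nextflow's baseName
    base = Path(filename).stem
    # Same pattern as the Groovy snippet: greedy prefix, 'fr', two digits, rest
    match = re.fullmatch(r".*fr(\d\d).*", base)
    if match is None:
        # Hypothetical error handling; the Groovy version fails in toInteger()
        raise ValueError(f"no fraction number found in {filename!r}")
    return int(match.group(1))


print(fraction_number("samplename_or_something_fr01.mzML"))  # prints 1
```

A name without such a suffix, for example TCAM2.mzML, raises ValueError in this sketch. In the Groovy expression the unmatched name is handed to toInteger() unchanged, which also fails.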
Great, the git pull worked! Thanks for the quick reply!
However, now I ran into a new error...
./nextflow run ipaw.nf --tdb VarDB.fasta --mzmls *.mzML --gtf VarDB.gtf --knownproteins Homo_sapiens.GRCh38.pep.all.fa.gz --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --snpfa MSCanProVar_ensemblV79.filtered.fasta --genome hg19.chr1-22.X.Y.M.fa.masked --cosmic CosmicMutantExport.tsv --outdir results
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [compassionate_fourier] - revision: 2f897ac094
WARN: Access to undefined parameter `dbsnp` -- Initialise it to a default value eg. `params.dbsnp = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[ac/658a74] Submitted process > concatFasta
[76/3ff5ac] Submitted process > makeTrypSeq
[3b/db0da8] Submitted process > makeProtSeq
[fc/abc54a] Submitted process > createSpectraLookup (1)
ERROR ~ Error executing process > 'concatFasta'
Caused by:
Process `concatFasta` terminated with an error exit status (127)
Command executed:
cat VarDB.fasta Homo_sapiens.GRCh38.pep.all.fa.gz > db.fa
Command exit status:
127
Command output:
(empty)
Command error:
/bin/bash: error while loading shared libraries: libtinfo.so.5: cannot open shared object file: Permission denied
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/ac/658a7450cc52d4d65e7282ef78c28e
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (3)
And when I looked into .command.run I noticed that when I commented out
docker run -i -v /home/weronika/proteogenomics-analysis-workflow:/home/weronika/proteogenomics-analysis-workflow -v "$PWD":"$PWD" -w "$PWD" --entrypoint /bin/bash -u $(id -u):$(id -g)$
it no longer gave me this error, so I think there might be something wrong with Docker?
Also I was wondering, since the pipeline maps to hg19, is there also a version coming soon where it will map to hg38?
Interesting, we haven't seen that error here. I found a similar error message here. Our underlying system for running docker has been Ubuntu, but if yours is RH-based, maybe the linked bug report is correct that SELinux is preventing Docker's access to the library. I am not sure we can help there (and the linked bug report is filed under CANTFIX, which is also not good news). Maybe you have to talk with your system administrator.
cat /etc/issue
Ubuntu 16.04.4 LTS \n \l
As it shows, I have Ubuntu, so RH shouldn't be the problem. I already talked with my administrator, and he showed me that it might have something to do with docker by commenting that line out, as said above.
Anyway, thanks for your help
I'm by no means a docker or nextflow expert, but it looks a bit like the cat command inside docker isn't allowed by your system. Indeed commenting out docker should not remove that error.
We run Ubuntu 16.04 as well, docker version 17.12.1-ce, build 7390fc6, and the user who runs the nextflow command is a member of the docker group. If you run the pipeline with sudo (which shouldn't be necessary if the regular user can start docker containers), does the error disappear?
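One quick way to check the group membership mentioned above is a few lines of Python using the standard grp and pwd modules (Unix only). This is a minimal sketch: the function name user_in_group is hypothetical, and it only reads the local group database, so it says nothing about SELinux or other restrictions.

```python
import grp
import pwd


def user_in_group(username, groupname="docker"):
    """Return True if `username` belongs to `groupname` on this machine."""
    try:
        group = grp.getgrnam(groupname)
    except KeyError:
        # The group does not exist at all (e.g. Docker is not installed)
        return False
    if username in group.gr_mem:
        return True
    try:
        # A user's primary group is not listed in gr_mem, so check it too
        return pwd.getpwnam(username).pw_gid == group.gr_gid
    except KeyError:
        # Unknown user
        return False
```

Calling user_in_group(getpass.getuser()) would test the account that launches nextflow. Note that a freshly added group membership normally only takes effect after logging in again.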
So I went into the work dir folder (/home/weronika/proteogenomics-analysis-workflow/work/ac/658a7450cc52d4d65e7282ef78c28e) from the previous error and put sudo in front of the docker command in the .command.run file. When I ran it again, it gave the same permission-denied error, but for this work dir: /home/weronika/proteogenomics-analysis-workflow/work/f4/c0a5eff5105afe28da036b9ea1fda0. I again put sudo in front of the docker command in that .command.run, ran it again, and got the same error for yet another folder. However, there are 139 folders in work, so I gave up after the 3rd time, since I'm not going to change those 139 .command.run files manually. So then I tried putting sudo in front of ./nextflow and it gave me the following:
sudo ./nextflow run ipaw.nf --tdb VarDB.fasta --mzmls *.mzML --gtf VarDB.gtf --knownproteins Homo_sapiens.GRCh38.pep.all.fa.gz --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --snpfa MSCanProVar_ensemblV79.filtered.fasta --genome hg19.chr1-22.X.Y.M.fa.masked --cosmic CosmicMutantExport.tsv --outdir results
[sudo] password for weronika:
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [dreamy_hilbert] - revision: 2f897ac094
WARN: Access to undefined parameter `dbsnp` -- Initialise it to a default value eg. `params.dbsnp = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[84/963217] Submitted process > makeProtSeq
[73/937bdd] Submitted process > concatFasta
[c4/f060b8] Submitted process > makeTrypSeq
[e5/fffd82] Submitted process > createSpectraLookup (1)
ERROR ~ Error executing process > 'makeProtSeq'
Caused by:
Process `makeProtSeq` terminated with an error exit status (1)
Command executed:
msslookup protspace -i Homo_sapiens.GRCh38.pep.all.fa.gz --minlen 8
Command exit status:
1
Command output:
(empty)
Command error:
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
Traceback (most recent call last):
File "/usr/local/bin/msslookup", line 6, in <module>
sys.exit(app.mslookup.main())
File "/usr/local/lib/python3.6/site-packages/app/mslookup.py", line 21, in main
startup.start_msstitch(drivers, sys.argv)
File "/usr/local/lib/python3.6/site-packages/app/drivers/startup.py", line 53, in start_msstitch
args.func(**vars(args))
File "/usr/local/lib/python3.6/site-packages/app/drivers/base.py", line 74, in start
self.run()
File "/usr/local/lib/python3.6/site-packages/app/drivers/mslookup/base.py", line 35, in run
self.create_lookup()
File "/usr/local/lib/python3.6/site-packages/app/drivers/mslookup/seqspace.py", line 54, in create_lookup
self.minlength)
File "/usr/local/lib/python3.6/site-packages/app/actions/mslookup/searchspace.py", line 7, in create_searchspace_wholeproteins
prots = {str(prot.seq).replace('L', 'I'): prot.id for prot in fasta}
File "/usr/local/lib/python3.6/site-packages/app/actions/mslookup/searchspace.py", line 7, in <dictcomp>
prots = {str(prot.seq).replace('L', 'I'): prot.id for prot in fasta}
File "/usr/local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 609, in parse
for r in i:
File "/usr/local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 122, in FastaIterator
for title, sequence in SimpleFastaParser(handle):
File "/usr/local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 43, in SimpleFastaParser
line = handle.readline()
File "/usr/local/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/84/963217e693626b08636f0d011f4de0
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (3)
So unfortunately sudo doesn't solve my problems.
Hi, I can't be 100% sure, but from the filename it looks like the --knownproteins Homo_sapiens.GRCh38.pep.all.fa.gz file needs to be unzipped.
Also, sudo ./nextflow ... as you did works fine; you don't have to do it in each work dir. The result is that all the docker-launching commands are run as sudo. It may also work if you add sudo: True to the config file as we used to have (it was removed here). Running everything as sudo will also give your work directories root ownership, though, which can be problematic if you later run something without sudo that cannot access them, but I guess a working pipeline is more important right now :).
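The byte 0x8b at position 1 in the UnicodeDecodeError above matches the second byte of the gzip magic number (0x1f 0x8b), which fits the diagnosis that the FASTA file was read while still compressed. Below is a small sketch for checking and decompressing such a file before passing it to the pipeline. The function names is_gzipped and ensure_plain_fasta are hypothetical and not part of the pipeline.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_plain_fasta(path):
    """Return the path of an uncompressed copy of `path` (or `path` itself)."""
    if not is_gzipped(path):
        return path
    # Strip a trailing .gz if present, otherwise append a suffix
    plain = path[:-3] if path.endswith(".gz") else path + ".plain"
    with gzip.open(path, "rb") as src, open(plain, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return plain
```

For the file in this thread, ensure_plain_fasta("Homo_sapiens.GRCh38.pep.all.fa.gz") would write Homo_sapiens.GRCh38.pep.all.fa next to it, which has the same effect as running gunzip -k on the command line.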
I have now used the unzipped version for --knownproteins, and it did get me further! You should probably change that in your README file, though.
Unfortunately, I ran into a new error.
sudo ./nextflow run ipaw.nf --tdb VarDB.fasta --mzmls *.mzML --gtf VarDB.gtf --knownproteins Homo_sapiens.GRCh38.pep.all.fa --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --snpfa MSCanProVar_ensemblV79.filtered.fasta --genome hg19.chr1-22.X.Y.M.fa.masked --cosmic CosmicMutantExport.tsv --outdir results
[sudo] password for weronika:
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [sharp_goldstine] - revision: 2f897ac094
WARN: Access to undefined parameter `dbsnp` -- Initialise it to a default value eg. `params.dbsnp = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[64/dd0905] Submitted process > makeTrypSeq
[7b/206d15] Submitted process > makeProtSeq
[78/d5b6b2] Submitted process > concatFasta
[da/24b225] Submitted process > createSpectraLookup (1)
[bb/d53822] Submitted process > makeDecoyReverseDB
[af/af784a] Submitted process > msgfPlus (1)
ERROR ~ Error executing process > 'msgfPlus (1)'
Caused by:
Missing output file(s) `TCAM2.mzid` expected by process `msgfPlus (1)`
Command executed:
msgf_plus -Xmx16G -d concatdb.fasta -s TCAM2.mzML -o "TCAM2.mzid" -thread 12 -mod Mods.txt -tda 0 -t 10.0ppm -ti -1,2 -m 0 -inst 3 -e 1 -protocol null -ntt 2 -minLength 7 -maxLength 50 -minCharge 2 -maxCharge 6 -n 1 -addFeatures 1
msgf_plus -Xmx3500M edu.ucsd.msjava.ui.MzIDToTsv -i "TCAM2.mzid" -o out.mzid.tsv
Command exit status:
0
Command output:
MS-GF+ Release (v2016.10.26) (26 Oct 2016)
Usage: java -Xmx3500M -jar MSGFPlus.jar
-s SpectrumFile (*.mzML, *.mzXML, *.mgf, *.ms2, *.pkl or *_dta.txt)
-d DatabaseFile (*.fasta or *.fa)
[-o OutputFile (*.mzid)] (Default: [SpectrumFileName].mzid)
[-t PrecursorMassTolerance] (e.g. 2.5Da, 20ppm or 0.5Da,2.5Da, Default: 20ppm)
Use comma to set asymmetric values. E.g. "-t 0.5Da,2.5Da" will set 0.5Da to the minus (expMass<theoMass) and 2.5Da to plus (expMass>theoMass)
[-ti IsotopeErrorRange] (Range of allowed isotope peak errors, Default:0,1)
Takes into account of the error introduced by chooosing a non-monoisotopic peak for fragmentation.
The combination of -t and -ti determins the precursor mass tolerance.
E.g. "-t 20ppm -ti -1,2" tests abs(exp-calc-n*1.00335Da)<20ppm for n=-1, 0, 1, 2.
[-thread NumThreads] (Number of concurrent threads to be executed, Default: Number of available cores)
[-tda 0/1] (0: don't search decoy database (Default), 1: search decoy database)
[-m FragmentMethodID] (0: As written in the spectrum or CID if no info (Default), 1: CID, 2: ETD, 3: HCD, 4: UVPD)
[-inst MS2DetectorID] (0: Low-res LCQ/LTQ (Default), 1: Orbitrap/FTICR, 2: TOF, 3: Q-Exactive)
[-e EnzymeID] (0: unspecific cleavage, 1: Trypsin (Default), 2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: glutamyl endopeptidase, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: no cleavage)
[-protocol ProtocolID] (0: Automatic (Default), 1: Phosphorylation, 2: iTRAQ, 3: iTRAQPhospho, 4: TMT, 5: Standard)
[-ntt 0/1/2] (Number of Tolerable Termini, Default: 2)
E.g. For trypsin, 0: non-tryptic, 1: semi-tryptic, 2: fully-tryptic peptides only.
[-mod ModificationFileName] (Modification file, Default: standard amino acids with fixed C+57)
[-minLength MinPepLength] (Minimum peptide length to consider, Default: 6)
[-maxLength MaxPepLength] (Maximum peptide length to consider, Default: 40)
[-minCharge MinCharge] (Minimum precursor charge to consider if charges are not specified in the spectrum file, Default: 2)
[-maxCharge MaxCharge] (Maximum precursor charge to consider if charges are not specified in the spectrum file, Default: 3)
[-n NumMatchesPerSpec] (Number of matches per spectrum to be reported, Default: 1)
[-addFeatures 0/1] (0: output basic scores only (Default), 1: output additional features)
[-ccm ChargeCarrierMass] (Mass of charge carrier, Default: mass of proton (1.00727649))
Example (high-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 20ppm -ti -1,2 -ntt 2 -tda 1 -o testMSGFPlus.mzid
Example (low-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 0.5Da,2.5Da -ntt 2 -tda 1 -o testMSGFPlus.mzid
MzIDToTsv v9108 (26 Oct 2016)
Usage: java -Xmx3500M -cp MSGFPlus.jar edu.ucsd.msjava.ui.MzIDToTsv
-i MzIDPath (MS-GF+ output file (*.mzid) or directory containing mzid files)
[-o TSVFile] (TSV output file (*.tsv) (Default: MzIDFileName.tsv))
[-showQValue 0/1] (0: do not show Q-values, 1: show Q-values (Default))
[-showDecoy 0/1] (0: do not show decoy PSMs (Default), 1: show decoy PSMs)
[-showFormula 0/1] (0: do not show molecular formula (Default), 1: show molecular formula of peptides)
[-unroll 0/1] (0: merge shared peptides (Default), 1: unroll shared peptides)
Command error:
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
[Error] Invalid value for parameter -protocol: null (must be an integer)
[Error] Invalid value for parameter -i: TCAM@.mzid (file does not exist)
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/af/af784ac36c35fa68e3d788dc8aeefc
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
Also, about the sudo: True setting: should I add that to the config file under standard/docker:
standard {
docker {
enabled = true
fixOwnership = true
runOptions = "-u \$(id -u):\$(id -g)"
}
or under slurm/docker:
slurm {
docker {
enabled = true
fixOwnership = true
runOptions = "-u \$(id -u):\$(id -g)"
}
or both?
Thank you, you found a bug! I am guessing you are using labelfree data. Apparently, if you do not specify --isobaric ... the pipeline does not pick a labelfree -protocol for MSGFPlus. I just pushed an update, so try to git pull and rerun.
Two more things I see that you can add to the ./nextflow run command:
--dbsnp /path/to/snp142CodingDbSnp.txt (without this the pipeline will error)
-resume (very handy: nextflow will skip the parts that have already run in a previous run)
slurm and standard are config profiles that nextflow uses. You can add it to both if you like. If you don't specify which one to use with -profile, it will use standard. If you have to queue jobs to SLURM on a cluster, you can add -profile slurm to nextflow run ...
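To illustrate the -protocol bug discussed above: the MS-GF+ usage text in the log lists the protocol IDs (0: Automatic, 2: iTRAQ, 4: TMT), and the failing command passed null because no label was given. The sketch below shows one way such a mapping with a labelfree default could look. It is not the pipeline's actual fix (which lives in the Nextflow script); the function name msgf_protocol_id is hypothetical, and the prefix matching on the label name is an assumption.

```python
def msgf_protocol_id(isobaric=None):
    """Map an --isobaric label name to an MS-GF+ -protocol ID.

    IDs follow the MS-GF+ usage text: 0 = Automatic, 2 = iTRAQ, 4 = TMT.
    """
    if not isobaric:
        # Labelfree data: fall back to 0 (Automatic) instead of passing null
        return 0
    label = isobaric.lower()
    if label.startswith("tmt"):
        return 4
    if label.startswith("itraq"):
        return 2
    # Assumption: any other label name is a user error
    raise ValueError(f"unknown isobaric label: {isobaric!r}")


print(msgf_protocol_id())             # prints 0
print(msgf_protocol_id("tmt10plex"))  # prints 4
```

The point of the default branch is that the search command always receives an integer, which is what the "[Error] Invalid value for parameter -protocol: null (must be an integer)" message in the log was complaining about.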
Even though I updated the code and did everything you said, I unfortunately keep running into a new error:
sudo ./nextflow run ipaw.nf --resume --tdb VarDB.fasta --mzmls *.mzML --gtf VarDB.gtf --knownproteins Homo_sapiens.GRCh38.pep.all.fa --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --snpfa MSCanProVar_ensemblV79.filtered.fasta --genome hg19.chr1-22.X.Y.M.fa.masked --dbsnp snp142CodingDbSnp.txt --cosmic CosmicMutantExport.tsv --outdir results
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [intergalactic_kare] - revision: baa0e7b888
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[ec/e71907] Submitted process > concatFasta
[c2/1dead3] Submitted process > makeProtSeq
[63/2d14e4] Submitted process > makeTrypSeq
[9b/388592] Submitted process > createSpectraLookup (1)
[b4/b0007d] Submitted process > makeDecoyReverseDB
[93/f9c06c] Submitted process > msgfPlus (1)
[b0/c9df8e] Submitted process > percolator (1)
[7c/ee33b3] Submitted process > filterPercolator (1)
[c2/38a4d8] Submitted process > svmToTSV (1)
[3c/3e26c9] Submitted process > svmToTSV (2)
ERROR ~ Error executing process > 'svmToTSV (1)'
Caused by:
Process `svmToTSV (1)` terminated with an error exit status (1)
Command executed:
#!/usr/bin/env python
from glob import glob
mzidtsvfns = sorted(glob('mzidtsv*'))
mzidfns = sorted(glob('mzident*'))
from app.readers import pycolator, xml, tsv, mzidplus
import os
ns = xml.get_namespace_from_top('fp_th0.xml', None)
psms = {p.attrib['{%s}psm_id' % ns['xmlns']]: p for p in pycolator.generate_psms('fp_th0.xml', ns)}
decoys = {True: 0, False: 0}
for psm in sorted([(pid, float(p.find('{%s}svm_score' % ns['xmlns']).text), p) for pid, p in psms.items()], reverse=True, key=lambda x:x[1]):
pdecoy = psm[2].attrib['{%s}decoy' % ns['xmlns']] == 'true'
decoys[pdecoy] += 1
psms[psm[0]] = {'decoy': pdecoy, 'svm': psm[1], 'qval': decoys[True]/decoys[False]} # T-TDC
decoys = {'true': 0, 'false': 0}
for svm, pep in sorted([(float(x.find('{%s}svm_score' % ns['xmlns']).text), x) for x in pycolator.generate_peptides('fp_th0.xml', ns)], reverse=True, key=lambda x:x[0]):
decoys[pep.attrib['{%s}decoy' % ns['xmlns']]] += 1
[psms[pid.text].update({'pepqval': decoys['true']/decoys['false']}) for pid in pep.find('{%s}psm_ids' % ns['xmlns'])]
oldheader = tsv.get_tsv_header(mzidtsvfns[0])
header = oldheader + ['percolator svm-score', 'PSM q-value', 'peptide q-value']
with open('mzidperco', 'w') as fp:
fp.write('\t'.join(header))
for fnix, mzidfn in enumerate(mzidfns):
mzns = mzidplus.get_mzid_namespace(mzidfn)
siis = (sii for sir in mzidplus.mzid_spec_result_generator(mzidfn, mzns) for sii in sir.findall('{%s}SpectrumIdentificationItem' % mzns['xmlns']))
for specidi, psm in zip(siis, tsv.generate_tsv_psms(mzidtsvfns[fnix], oldheader)):
# percolator psm ID is: samplename_SII_scannr_rank_scannr_charge_rank
print(specidi)
print(psm)
scan, rank = specidi.attrib['id'].replace('SII_', '').split('_')
outpsm = {k: v for k,v in psm.items()}
spfile = os.path.splitext(psm['#SpecFile'])[0]
try:
percopsm = psms['{fn}_SII_{sc}_{rk}_{sc}_{ch}_{rk}'.format(fn=spfile, sc=scan, rk=rank, ch=psm['Charge'])]
except KeyError:
continue
if percopsm['decoy']:
continue
fp.write('\n')
outpsm.update({'percolator svm-score': percopsm['svm'], 'PSM q-value': percopsm['qval'], 'peptide q-value': percopsm['pepqval']})
fp.write('\t'.join([str(outpsm[k]) for k in header]))
Command exit status:
1
Command output:
(empty)
Command error:
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
Traceback (most recent call last):
File ".command.sh", line 13, in <module>
psms[psm[0]] = {'decoy': pdecoy, 'svm': psm[1], 'qval': decoys[True]/decoys[False]} # T-TDC
ZeroDivisionError: division by zero
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/c2/38a4d8b7dc4b2ded5920fef75fbfa8
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)
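The ZeroDivisionError in the traceback above is raised on the line that computes decoys[True]/decoys[False]: if the highest-scoring PSMs are decoys, the target count is still zero at the first division. A minimal sketch of one way to guard that division follows. It is not the fix the maintainers applied, the function name running_qvalues is hypothetical, and reporting 1.0 while no target has been seen is an assumption.

```python
def running_qvalues(scored):
    """Compute a running decoy/target ratio over score-sorted PSMs.

    `scored` is an iterable of (score, is_decoy) pairs. Returns a list of
    (score, is_decoy, qvalue) tuples, best score first.
    """
    decoys = 0
    targets = 0
    result = []
    for score, is_decoy in sorted(scored, key=lambda x: x[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Guard: with no targets seen yet the ratio is undefined, so use 1.0
        qval = decoys / targets if targets else 1.0
        result.append((score, is_decoy, qval))
    return result


print(running_qvalues([(3.0, True), (2.0, False), (1.0, False)]))
# prints [(3.0, True, 1.0), (2.0, False, 1.0), (1.0, False, 0.5)]
```

A top-ranked decoy is usually a sign that the search itself went wrong (for example wrong modifications for the data), so a guard like this only avoids the crash; it does not make the results better.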
So I have updated the code with git pull and I have added the --dbsnp. I also added sudo = true in the config files under both slurm and standard, although I still had to use sudo in front of the ./nextflow command, so I'm not sure if that had any effect?
And if I add the --isobaric ... I get the following error:
sudo ./nextflow run ipaw.nf --resume --tdb VarDB.fasta --mzmls *.mzML --gtf VarDB.gtf --knownproteins Homo_sapiens.GRCh38.pep.all.fa --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --snpfa MSCanProVar_ensemblV79.filtered.fasta --genome hg19.chr1-22.X.Y.M.fa.masked --isobaric ... --dbsnp snp142CodingDbSnp.txt --cosmic CosmicMutantExport.tsv --outdir results
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [prickly_babbage] - revision: baa0e7b888
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
WARN: Access to undefined parameter `denoms` -- Initialise it to a default value eg. `params.denoms = some_value`
Detected setnames: NA
[warm up] executor > local
[b3/18ad0d] Submitted process > makeProtSeq
[c1/8c784b] Submitted process > concatFasta
[02/8feea9] Submitted process > makeTrypSeq
[14/5ac39b] Submitted process > IsobaricQuant (1)
[8d/6bb322] Submitted process > makeDecoyReverseDB
ERROR ~ Error executing process > 'IsobaricQuant (1)'
Caused by:
Process `IsobaricQuant (1)` terminated with an error exit status (6)
Command executed:
IsobaricAnalyzer -type ... -in TCAM2.mzML -out "TCAM2.mzML.consensusXML" -extraction:select_activation "High-energy collision-induced dissociation" -extraction:reporter_mass_shift null -extraction:min_precursor_intensity 1.0 -extraction:keep_unannotated_precursor true -quantification:isotope_correction true
Command exit status:
6
Command output:
Invalid parameter values (ConversionError): Could not convert string 'null' to a double value. Aborting!
Command error:
a7f760de4b27: Already exists
d836c29a56fb: Already exists
6c2ebb6634fc: Already exists
00f810677cff: Already exists
531ebc5af9ff: Already exists
a3ed95caeb02: Already exists
aef3b3b2fa0d: Already exists
05c89845ef18: Pulling fs layer
05c89845ef18: Verifying Checksum
05c89845ef18: Download complete
05c89845ef18: Pull complete
Digest: sha256:2373b8c92a79f51a3833576b24629a3fadb0140b30fb70f9c4cfa18c6d7a3641
Status: Downloaded newer image for quay.io/biocontainers/openms:2.2.0--py27_boost1.64_0
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
stty: standard input: Inappropriate ioctl for device
IsobaricAnalyzer -- Calculates isobaric quantitative values for peptides
Version: 2.2.0 Jul 10 2017, 11:42:37, Revision: HEAD-HASH-NOTFOUND
Usage:
IsobaricAnalyzer <options>
This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option.
Options (mandatory options marked with '*'):
-type <mode> Isobaric Quantitation method used in the experiment. (default: 'itraq4plex' valid: 'itraq4plex', 'itraq8plex', 'tmt10plex', 'tmt6plex')
-in <file>* Input raw/picked data file (valid formats: 'mzML')
-out <file>* Output consensusXML file with quantitative information (valid formats: 'consensusXML')
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default: '1')
-write_ini <file> Writes the default configuration file
-id_pool <file> ID pool file to DocumentID's for all generated output files. Disabled by default. (Set to 'main' to use /usr/local/share/OpenMS/IDPool/IDPool.txt)
--help Shows options
--helphelp Shows all options (including advanced)
The following configuration subsections are valid:
- extraction Parameters for the channel extraction.
- itraq4plex Algorithm parameters for iTRAQ 4-plex
- itraq8plex Algorithm parameters for iTRAQ 8-plex
- quantification Parameters for the peptide quantification.
- tmt10plex Algorithm parameters for TMT 10-plex
- tmt6plex Algorithm parameters for TMT 6-plex
You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/14/5ac39b0f717e9d15dc9e746ebbd049
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (3)
Sorry, I meant if you have isobaric data, use --isobaric tmt10plex for tmt10plex data, --isobaric itraq8plex for itraq data, etc. If you have labelfree data, the git pull should now get you a proper labelfree protocol for MSGF+. Is this labelfree data?
But you got the first error also. What I forgot to mention (and it is not in the README) is that if you have labelfree data, you will also need to specify another modification file with --mod your_modfile.txt. Basically you can use Mods.txt and remove the lines with tmt6plex in them. This will give better PSMs and MAYBE solve the first problem.
Yes, the data is labelfree. I made a new modfile and removed the lines with tmt6plex. Now I got this error:
sudo ./nextflow run ipaw.nf -resume \
--tdb VarDB.fasta \
--mzmls TCAM2.mzML \
--gtf VarDB.gtf \
--knownproteins Homo_sapiens.GRCh38.pep.all.fa \
--blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta \
--snpfa MSCanProVar_ensemblV79.filtered.fasta \
--genome hg19.chr1-22.X.Y.M.fa.masked \
--dbsnp snp142CodingDbSnp.txt \
--cosmic CosmicMutantExport.tsv \
--mod new_mods.txt \
--outdir results
[sudo] password for weronika:
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [mad_yonath] - revision: baa0e7b888
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[59/a2e899] Cached process > makeProtSeq
[93/28b657] Cached process > makeTrypSeq
[89/aaa546] Cached process > concatFasta
[33/807ea5] Cached process > createSpectraLookup (1)
[66/305130] Cached process > makeDecoyReverseDB
[d0/19d122] Submitted process > msgfPlus (1)
ERROR ~ Error executing process > 'msgfPlus (1)'
Caused by:
Missing output file(s) `TCAM2.mzid` expected by process `msgfPlus (1)`
Command executed:
msgf_plus -Xmx16G -d concatdb.fasta -s TCAM2.mzML -o "TCAM2.mzid" -thread 12 -mod Mods.txt -tda 0 -t 10.0ppm -ti -1,2 -m 0 -inst 3 -e 1 -protocol 0 -ntt 2 -minLength 7 -maxLength 50 -minCharge 2 -maxCharge 6 -n 1 -addFeatures 1
msgf_plus -Xmx3500M edu.ucsd.msjava.ui.MzIDToTsv -i "TCAM2.mzid" -o out.mzid.tsv
Command exit status:
0
Command output:
MS-GF+ Release (v2016.10.26) (26 Oct 2016)
Usage: java -Xmx3500M -jar MSGFPlus.jar
-s SpectrumFile (*.mzML, *.mzXML, *.mgf, *.ms2, *.pkl or *_dta.txt)
-d DatabaseFile (*.fasta or *.fa)
[-o OutputFile (*.mzid)] (Default: [SpectrumFileName].mzid)
[-t PrecursorMassTolerance] (e.g. 2.5Da, 20ppm or 0.5Da,2.5Da, Default: 20ppm)
Use comma to set asymmetric values. E.g. "-t 0.5Da,2.5Da" will set 0.5Da to the minus (expMass<theoMass) and 2.5Da to plus (expMass>theoMass)
[-ti IsotopeErrorRange] (Range of allowed isotope peak errors, Default:0,1)
Takes into account of the error introduced by chooosing a non-monoisotopic peak for fragmentation.
The combination of -t and -ti determins the precursor mass tolerance.
E.g. "-t 20ppm -ti -1,2" tests abs(exp-calc-n*1.00335Da)<20ppm for n=-1, 0, 1, 2.
[-thread NumThreads] (Number of concurrent threads to be executed, Default: Number of available cores)
[-tda 0/1] (0: don't search decoy database (Default), 1: search decoy database)
[-m FragmentMethodID] (0: As written in the spectrum or CID if no info (Default), 1: CID, 2: ETD, 3: HCD, 4: UVPD)
[-inst MS2DetectorID] (0: Low-res LCQ/LTQ (Default), 1: Orbitrap/FTICR, 2: TOF, 3: Q-Exactive)
[-e EnzymeID] (0: unspecific cleavage, 1: Trypsin (Default), 2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: glutamyl endopeptidase, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: no cleavage)
[-protocol ProtocolID] (0: Automatic (Default), 1: Phosphorylation, 2: iTRAQ, 3: iTRAQPhospho, 4: TMT, 5: Standard)
[-ntt 0/1/2] (Number of Tolerable Termini, Default: 2)
E.g. For trypsin, 0: non-tryptic, 1: semi-tryptic, 2: fully-tryptic peptides only.
[-mod ModificationFileName] (Modification file, Default: standard amino acids with fixed C+57)
[-minLength MinPepLength] (Minimum peptide length to consider, Default: 6)
[-maxLength MaxPepLength] (Maximum peptide length to consider, Default: 40)
[-minCharge MinCharge] (Minimum precursor charge to consider if charges are not specified in the spectrum file, Default: 2)
[-maxCharge MaxCharge] (Maximum precursor charge to consider if charges are not specified in the spectrum file, Default: 3)
[-n NumMatchesPerSpec] (Number of matches per spectrum to be reported, Default: 1)
[-addFeatures 0/1] (0: output basic scores only (Default), 1: output additional features)
[-ccm ChargeCarrierMass] (Mass of charge carrier, Default: mass of proton (1.00727649))
Example (high-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 20ppm -ti -1,2 -ntt 2 -tda 1 -o testMSGFPlus.mzid
Example (low-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 0.5Da,2.5Da -ntt 2 -tda 1 -o testMSGFPlus.mzid
MzIDToTsv v9108 (26 Oct 2016)
Usage: java -Xmx3500M -cp MSGFPlus.jar edu.ucsd.msjava.ui.MzIDToTsv
-i MzIDPath (MS-GF+ output file (*.mzid) or directory containing mzid files)
[-o TSVFile] (TSV output file (*.tsv) (Default: MzIDFileName.tsv))
[-showQValue 0/1] (0: do not show Q-values, 1: show Q-values (Default))
[-showDecoy 0/1] (0: do not show decoy PSMs (Default), 1: show decoy PSMs)
[-showFormula 0/1] (0: do not show molecular formula (Default), 1: show molecular formula of peptides)
[-unroll 0/1] (0: merge shared peptides (Default), 1: unroll shared peptides)
Command error:
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
[Error] Invalid value for parameter -mod: Mods.txt (file does not exist)
[Error] Invalid value for parameter -i: TCAM2.mzid (file does not exist)
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/d0/19d12211d3e9d04a68785c13796d96
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
My bad :( I typed too fast in the previous comment on this issue, sorry. Write --mods instead of --mod
Still an error.. :( but I got further!
sudo ./nextflow run ipaw.nf -resume \
> --tdb VarDB.fasta \
> --mzmls TCAM2.mzML \
> --gtf VarDB.gtf \
> --knownproteins Homo_sapiens.GRCh38.pep.all.fa \
> --blastdb UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta \
> --snpfa MSCanProVar_ensemblV79.filtered.fasta \
> --genome hg19.chr1-22.X.Y.M.fa.masked \
> --dbsnp snp142CodingDbSnp.txt \
> --cosmic CosmicMutantExport.tsv \
> --mods new_mods.txt \
> --outdir results
[sudo] password for weronika:
N E X T F L O W ~ version 0.28.0
Launching `ipaw.nf` [focused_perlman] - revision: baa0e7b888
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
[d7/571a70] Cached process > makeTrypSeq
[02/2dcc11] Cached process > concatFasta
[9f/f2edd7] Cached process > createSpectraLookup (1)
[9e/cd97b8] Cached process > makeProtSeq
[f3/fc335a] Cached process > makeDecoyReverseDB
[23/d67cec] Submitted process > msgfPlus (1)
[1f/e6ae15] Submitted process > percolator (1)
[62/9639d0] Submitted process > filterPercolator (1)
[e6/7dda40] Submitted process > svmToTSV (1)
[69/bdc780] Submitted process > svmToTSV (2)
[75/12f030] Submitted process > createPSMPeptideTable (2)
[be/cb5990] Submitted process > createPSMPeptideTable (1)
[12/f59e4a] Submitted process > createFastaBedGFF (1)
[1e/11e55d] Submitted process > prepSpectrumAI (1)
[3a/7e0764] Submitted process > mergeSetPSMtable (1)
[d3/bf04f6] Submitted process > mergeSetPSMtable (2)
[cf/238b47] Submitted process > prePeptideTable (1)
Pipeline output ready: /home/weronika/proteogenomics-analysis-workflow/work/3a/7e0764481db9563fa878c996fb8a79/variant_psmtable.txt
Pipeline output ready: /home/weronika/proteogenomics-analysis-workflow/work/d3/bf04f6a0698b0b1cb28818c93c2292/novel_psmtable.txt
[36/1feb40] Submitted process > SpectrumAI (1)
ERROR ~ Error executing process > 'prePeptideTable (1)'
Caused by:
Process `prePeptideTable (1)` terminated with an error exit status (127)
Command executed:
null
Command exit status:
127
Command output:
(empty)
Command error:
.command.sh: line 2: null: command not found
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/cf/238b479b7ac48b71f6dc5f10d6f5dd
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (2)
Another bug! I pushed a fix; run git pull and try again.
By the way, thank you very much for your patience. Our pipeline is getting much better now that other labs are testing it with their own data!
No problem at all. Glad I can help, and it's great that you are quick with fixes :) We're not there yet, though. Everything is the same as above, except for the error:
ERROR ~ Error executing process > 'prePeptideTable (1)'
Caused by:
Process `prePeptideTable (1)` terminated with an error exit status (1)
Command executed:
msspsmtable merge -o psms.txt -i psms*
msspeptable psm2pep -i psms.txt -o preisoquant --scorecolpattern svm --spectracol 1 --isobquantcolpattern plex
awk -F '\t' 'BEGIN {OFS = FS} {print $12,$13,$3,$7,$8,$9,$11,$14,$15,$16,$17,$18,$19,$20,$21,$22}' preisoquant > preordered
mv preordered peptidetable.txt
Command exit status:
1
Command output:
(empty)
Command error:
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
Traceback (most recent call last):
File "/usr/local/bin/msspeptable", line 6, in <module>
sys.exit(app.peptable.main())
File "/usr/local/lib/python3.6/site-packages/app/peptable.py", line 14, in main
startup.start_msstitch(drivers, sys.argv)
File "/usr/local/lib/python3.6/site-packages/app/drivers/startup.py", line 53, in start_msstitch
args.func(**vars(args))
File "/usr/local/lib/python3.6/site-packages/app/drivers/base.py", line 74, in start
self.run()
File "/usr/local/lib/python3.6/site-packages/app/drivers/pepprottable.py", line 15, in run
self.create_header()
File "/usr/local/lib/python3.6/site-packages/app/drivers/peptable/psmtopeptable.py", line 47, in create_header
self.precurquantcol)
File "/usr/local/lib/python3.6/site-packages/app/actions/headers/peptable.py", line 26, in get_psm2pep_header
isocols = tsv.get_columns_by_pattern(header, isobq_pattern)
File "/usr/local/lib/python3.6/site-packages/app/readers/tsv.py", line 148, in get_columns_by_pattern
'pattern: {}'.format(pattern))
RuntimeError: Could not find fieldname in header with pattern: plex
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/8b/6e6f872e55bd39bac8d07a5638ead2
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (2)
I applied another fix, pull and retry. I wonder how many times we have left.
Haha, me too. Another one:
[58/082c70] Submitted process > prePeptideTable (1)
[2e/8e7964] Submitted process > SpectrumAI (1)
[7b/e101af] Submitted process > mapVariantPeptidesToGenome (1)
[52/9acbeb] Submitted process > annovar (1)
[1d/175104] Submitted process > BlastPNovel (1)
[2c/3901db] Submitted process > phyloCSF (1)
[40/b711e8] Submitted process > BLATNovel (1)
[64/e764af] Submitted process > labelnsSNP (1)
[bf/678ee0] Submitted process > phastcons (1)
ERROR ~ Error executing process > 'BlastPNovel (1)'
Caused by:
Process `BlastPNovel (1)` terminated with an error exit status (1)
Command executed:
makeblastdb -in UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta -dbtype prot
blastp -db UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta -query novel_peptides.fa -outfmt '6 qseqid sseqid pident qlen slen qstart qend sstart send mismatch positive gapopen gaps qseq sseq evalue bitscore' -num_threads 8 -max_target_seqs 1 -evalue 1000 -out blastp_out.txt
Command exit status:
1
Command output:
Building a new DB, current time: 04/11/2018 12:57:30
New DB name: UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta
New DB title: UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 99970 sequences in 14.2546 seconds.
Command error:
Unable to find image 'quay.io/biocontainers/blast:2.7.1--boost1.64_1' locally
2.7.1--boost1.64_1: Pulling from biocontainers/blast
a3ed95caeb02: Already exists
4c1fa756c345: Already exists
a7f760de4b27: Already exists
d836c29a56fb: Already exists
6c2ebb6634fc: Already exists
00f810677cff: Already exists
531ebc5af9ff: Already exists
a3ed95caeb02: Already exists
aef3b3b2fa0d: Already exists
4cde73d2600f: Pulling fs layer
4cde73d2600f: Verifying Checksum
4cde73d2600f: Download complete
4cde73d2600f: Pull complete
Digest: sha256:2d118be6f6da0232af8420b05019c0d24b0c996c576d8a0c4b1700de1ff61b22
Status: Downloaded newer image for quay.io/biocontainers/blast:2.7.1--boost1.64_1
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss
USAGE
blastp [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-task task_name] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-negative_seqidlist filename]
[-entrez_query entrez_query] [-db_soft_mask filtering_algorithm]
[-db_hard_mask filtering_algorithm] [-subject subject_input_file]
[-subject_loc range] [-query input_file] [-out output_file]
[-evalue evalue] [-word_size int_value] [-gapopen open_penalty]
[-gapextend extend_penalty] [-qcov_hsp_perc float_value]
[-max_hsps int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value]
[-sum_stats bool_value] [-seg SEG_options] [-soft_masking soft_masking]
[-matrix matrix_name] [-threshold float_value] [-culling_limit int_value]
[-best_hit_overhang float_value] [-best_hit_score_edge float_value]
[-window_size int_value] [-lcase_masking] [-query_loc range]
[-parse_deflines] [-outfmt format] [-show_gis]
[-num_descriptions int_value] [-num_alignments int_value]
[-line_length line_length] [-html] [-max_target_seqs num_sequences]
[-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]
[-use_sw_tback] [-version]
DESCRIPTION
Protein-Protein BLAST 2.7.1+
Use '-help' to print detailed descriptions of command line arguments
========================================================================
Error: Argument "num_threads". Illegal value, expected (>=1 and =<4): `8'
Error: (CArgException::eConstraint) Argument "num_threads". Illegal value, expected (>=1 and =<4): `8'
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/1d/1751043c909078d5232b3fb70422de
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (4)
OK, the error occurs because the requested -num_threads 8 exceeds the number of cores available to blastp (it accepts values from 1 to 4 here).
As a quick fix, change -num_threads 8 to -num_threads 4 in the ipaw.nf script.
We will try to add a command-line option for this later.
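The hard-coded value could also be derived from the machine instead of edited by hand. A minimal sketch, assuming the limit blastp enforces matches the core count Python reports on the same machine (this may not hold in every setup); `cap_threads` is a hypothetical helper and not part of the pipeline.

```python
import os


def cap_threads(requested, available=None):
    """Return a thread count no larger than the number of usable cores."""
    if available is None:
        # os.cpu_count() may return None when the count cannot be determined
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    print("-num_threads", cap_threads(8))
```

On the four-core machine from the error above, `cap_threads(8, available=4)` gives 4, which is the value suggested as the manual fix.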
Unfortunately, I got another one:
[e4/95376d] Submitted process > mapVariantPeptidesToGenome (1)
[4f/503bbd] Submitted process > annovar (1)
[58/6597bd] Submitted process > BlastPNovel (1)
[9a/fb3687] Submitted process > phyloCSF (1)
[18/589540] Submitted process > phastcons (1)
[c7/184f34] Submitted process > labelnsSNP (1)
[e7/1bdf4d] Submitted process > BLATNovel (1)
[6d/abddd9] Submitted process > parseAnnovarOut (1)
[0b/16a2aa] Submitted process > ParseBlastpOut (1)
[37/70ae10] Submitted process > ValidateSingleMismatchNovpeps (1)
[af/3b6a89] Submitted process > novpepSpecAIOutParse (1)
ERROR ~ Error executing process > 'mapVariantPeptidesToGenome (1)'
Caused by:
Process `mapVariantPeptidesToGenome (1)` terminated with an error exit status (1)
Command executed:
python3 /pgpython/parse_spectrumAI_out.py --spectrumAI_out NA_variant_specairesult.txt --input peptidetable.txt --output NA_variant_peptides.txt
python3 /pgpython/map_cosmic_snp_tohg19.py --input NA_variant_peptides.txt --output NA_variant_peptides.saav.pep.hg19cor.vcf --cosmic_input CosmicMutantExport.tsv --dbsnp_input snp142CodingDbSnp.txt
Command exit status:
1
Command output:
protein id PGOHUM00000239752_ORF3(pre=K,post=N);PGOHUM_ENST00000547505.2_EEF1A1P17_ORF3(pre=K,post=N) can't be mapped
protein id PGOHUM00000237869_ORF1(pre=K,post=C);PGOHUM00000238279_ORF2(pre=K,post=C);PGOHUM_ENST00000437653.1_RP11-809F4.2_ORF1(pre=K,post=C);PGOHUM_ENST00000420887.2_RP11-259P15.3_ORF3(pre=K,post=C);lnc-TBL1XR1-16:1_ORF1(pre=K,post=C);lnc-FXR1-2:1_ORF3(pre=K,post=C) can't be mapped
protein id PGOHUM00000241423_ORF1(pre=K,post=G);PGOHUM_ENST00000394563.4_RP6-145B8.3_ORF3(pre=K,post=G) can't be mapped
protein id PGOHUM00000247287_ORF3(pre=R,post=I);PGOHUM_ENST00000559111.1_HSP90B2P_ORF2(pre=R,post=I) can't be mapped
protein id lnc-NKX2-4-5:2_ORF2(pre=-,post=R) can't be mapped
protein id PGOHUM00000250761_ORF3(pre=K,post=Q);PGOHUM_ENST00000507090.2_HSP90AB2P_ORF1(pre=K,post=Q);lnc-CPEB2-12:1_ORF3(pre=K,post=Q) can't be mapped
protein id PGOHUM00000242888_ORF1(pre=K,post=H);PGOHUM_ENST00000526512.1_RP11-382M14.1_ORF2(pre=K,post=H) can't be mapped
protein id PGOHUM00000240639_ORF3(pre=R,post=S);PGOHUM_ENST00000415103.1_AC078994.2_ORF1(pre=R,post=S) can't be mapped
protein id lnc-SLITRK1-4:1_ORF2(pre=R,post=L) can't be mapped
protein id PGOHUM00000241261_ORF1(pre=K,post=S);PGOHUM_ENST00000417985.1_ACTBP1_ORF1(pre=K,post=S) can't be mapped
protein id PGOHUM00000247755_ORF2(pre=Kped
protein id PGOHUM00000238102_ORF3(pre=R,post=H);PGOHUM_ENST00000471403.1_CDV3P1_ORF3(pre=R,post=H) can't be mapped
protein id PGOHUM00000236730_ORF3(pre=R,post=A);PGOHUM_ENST00000400056.3_KRT18P13_ORF3(pre=R,post=A) can't be mapped
protein id PGOHUM00000244156_ORF2(pre=K,post=L);PGOHUM_ENST00000419696.1_GAPDHP64_ORF2(pre=K,post=L) can't be mapped
protein id PGOHUM00000245611_ORF3(pre=K,post=K);PGOHUM_ENST00000512920.2_NPM1P41_ORF1(pre=K,post=K);lnc-HNRNPD-4:1_ORF1(pre=K,post=K) can't be mapped
protein id PGOHUM_ENST00000412323.1_ATP5A1P2_ORF1(pre=R,post=A) can't be mapped
protein id lnc-AKAP14-1:2_ORF2(pre=R,post=Q);lnc-AKAP14-1:1_ORF2(pre=R,post=Q);lnc-AKAP14-1:3_ORF1(pre=R,post=Q) can't be mapped
protein id lnc-AKAP14-1:1_ORF2(pre=R,post=G);lnc-AKAP14-1:3_ORF1(pre=R,post=G) can't be mapped
protein id lncRNA_ENST00000597346.1_ORF1(pre=K,post=Q);lncRNA_ENST00000561320.1_ORF3(pre=K,post=Q);lncRNA_ENST00000585816.1_ORF1(pre=K,post=Q);lnc-CDKN1C-3:4_ORF2(pre=K,post=Q);lnc-ABHD12-4:2_ORF3(pre=K,post=Q);lnc-CCZ1B-6:2_ORF1(pre=K,post=Q);lnc-ABHD12-4:1_ORF2(pre=K,post=Q);lnc-RGR-2:1_ORF1(pre=K,post=Q);lnc-CDKN1C-3:5_ORF1(pre=K,post=Q);lnc-ZNF682-3:4_ORF1(pre=K,post=Q);lnc-CDKN1C-3:8_ORF3(pre=K,post=Q);lnc-AADAT-9:1_ORF3(pre=K,post=Q);lnc-RASGRP1-3:2_ORF3(pre=K,post=Q);lnc-C3orf79-9:1_ORF2(pre=K,post=Q) can't be mapped
protein id PGOHUM_ENST00000378770.1_HSP90AA4P_ORF2(pre=K,post=I) can't be mapped
protein id PGOHUM00000239752_ORF3(pre=K,post=N);PGOHUM_ENST00000547505.2_EEF1A1P17_ORF3(pre=K,post=N) can't be mapped
protein id PGOHUM00000237869_ORF1(pre=K,post=C);PGOHUM00000238279_ORF2(pre=K,post=C);PGOHUM_ENST00000437653.1_RP11-809F4.2_ORF1(pre=K,post=C);PGOHUM_ENST00000420887.2_RP11-259P15.3_ORF3(pre=K,post=C);lnc-TBL1XR1-16:1_ORF1(pre=K,post=C);lnc-FXR1-2:1_ORF3(pre=K,post=C) can't be mapped
protein id PGOHUM00000241423_ORF1(pre=K,post=G);PGOHUM_ENST00000394563.4_RP6-145B8.3_ORF3(pre=K,post=G) can't be mapped
protein id PGOHUM00000247287_ORF3(pre=R,post=I);PGOHUM_ENST00000559111.1_HSP90B2P_ORF2(pre=R,post=I) can't be mapped
protein id lnc-NKX2-4-5:2_ORF2(pre=-,post=R) can't be mapped
protein id PGOHUM00000250761_ORF3(pre=K,post=Q);PGOHUM_ENST00000507090.2_HSP90AB2P_ORF1(pre=K,post=Q);lnc-CPEB2-12:1_ORF3(pre=K,post=Q) can't be mapped
protein id PGOHUM00000242888_ORF1(pre=K,post=H);PGOHUM_ENST00000526512.1_RP11-382M14.1_ORF2(pre=K,post=H) can't be mapped
protein id PGOHUM00000240639_ORF3(pre=R,post=S);PGOHUM_ENST00000415103.1_AC078994.2_ORF1(pre=R,post=S) can't be mapped
protein id lnc-SLITRK1-4:1_ORF2(pre=R,post=L) can't be mapped
protein id PGOHUM00000241261_ORF1(pre=K,post=S);PGOHUM_ENST00000417985.1_ACTBP1_ORF1(pre=K,post=S) can't be mapped
protein id PGOHUM00000247755_ORF2(pre=K,post=D);PGOHUM_ENST00000557130.1_UBE2CP1_ORF2(pre=K,post=D);lnc-STRN3-12:1_ORF2(pre=K,post=D) can't be mapped
protein id PGOHUM00000257036_ORF1(pre=K,post=N);PGOHUM_ENST00000440317.1_YWHAZP2_ORF1(pre=K,post=N) can't be mapped
protein id PGOHUM00000247098_ORF3(pre=R,post=N);PGOHUM_ENST00000569826.2_RP11-265N6.3_ORF1(pre=R,post=N) can't be mapped
protein id PGOHUM00000246771_ORF3(pre=R,post=C);PGOHUM_ENST00000418351.1_ACTBP7_ORF3(pre=R,post=C) can't be mapped
protein id PGOHUM00000259879_ORF1(pre=R,post=G);PGOHUM_ENST00000438353.1_HSP90AA5P_ORF1(pre=R,post=G) can't be mapped
protein id PGOHUM00000235293_ORF1(pre=K,post=K);PGOHUM_ENST00000515379.1_HNRNPA1P12_ORF1(pre=K,post=K) can't be mapped
protein id PGOHUM_ENST00000425843.1_HSPA8P1_ORF1(pre=R,post=S) can't be mapped
protein id PGOHUM00000233123_ORF2(pre=R,post=L);PGOHUM_ENST00000434621.2_GAPDHP68_ORF1(pre=R,post=L) can't be mapped
protein id PGOHUM00000243082_ORF1(pre=K,post=D);PGOHUM_ENST00000362070.3_HIST1H2APS4_ORF1(pre=K,post=D) can't be mapped
protein id PGOHUM00000241051_ORF1(pre=R,post=G);PGOHUM_ENST00000453073.1_HNRNPA1P47_ORF2(pre=R,post=G) can't be mapped
protein id PGOHUM00000237244_ORF1(pre=K,post=C);PGOHUM00000235394_ORF2(pre=K,post=C);PGOHUM00000241322_ORF2(pre=K,post=C);PGOHUM_ENST00000423783.1_AC055811.5_ORF3(pre=K,post=C);PGOHUM_ENST00000428275.1_ACTG1P10_ORF2(pre=K,post=C);lnc-KDM5C-3:1_ORF2(pre=K,post=C) can't be mapped
protein id PGOHUM00000238365_ORF1(pre=K,post=H);PGOHUM_ENST00000415473.1_PPIAP30_ORF1(pre=K,post=H) can't be mapped
protein id PGOHUM00000241240_ORF1(pre=K,post=T);PGOHUM_ENST00000452570.1_GAPDHP1_ORF1(pre=K,post=T) can't be mapped
protein id PGOHUM00000248221_ORF3(pre=R,post=T);PGOHUM_ENST00000557241.1_KRT18P7_ORF1(pre=R,post=T);lnc-TTC9-3:1_ORF1(pre=R,post=T) can't be mapped
protein id PGOHUM00000235394_ORF2(pre=K,post=S) can't be mapped
protein id PGOHUM00000246938_ORF1(pre=K,post=T);PGOHUM_ENST00000471472.2_RPL7P5_ORF1(pre=K,post=T) can't be mapped
protein id lnc-SENP6-10:1_ORF1(pre=R,post=G) can't be mapped
protein id PGOHUM00000248919_ORF3(pre=R,post=F);PGOHUM_ENST00000566277.2_CTD-2033A16.2_ORF3(pre=R,post=F) can't be mapped
protein id PGOHUM00000243288_ORF2(pre=R,post=C);PGOHUM_ENST00000403258.1_ACTBP8_ORF2(pre=R,post=C) can't be mapped
protein id PGOHUM00000248919_ORF3(pre=R,post=F);PGOHUM_ENST00000566277.2_CTD-2033A16.2_ORF3(pre=R,post=F) can't be mapped
Command error:
Traceback (most recent call last):
File "/pgpython/map_cosmic_snp_tohg19.py", line 105, in <module>
chr_position=cosmic_dic[cosmic_id][0]
KeyError: 'COSMIC:HSP90AA1:ENST00000334701:c.2494G>T:p.A832S'
.command.stub: line 99: 12 Terminated nxf_trace "$pid" .command.trace
Work dir:
/home/weronika/proteogenomics-analysis-workflow/work/e4/95376d80c1505ace2b027bab4168d3
Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (1)
Can you do one check for me? I just want to make sure that you have downloaded the correct version of the COSMIC file.
Try the following command and paste the results here (the pattern is quoted so that the shell does not treat > as a redirection).
grep ENST00000334701 CosmicMutantExport.tsv | grep 'c.2494G>T'
This is in the file:
HSP90AA1 ENST00000334701 2565 5253 TCGA-CM-6171-01 1651233 1566020 large_intestine colon ascending NS carcinoma adenocarcinoma NS NS y COSM1368331 c.2494G>T p.A832S Substitution - Missense u 37 14:102548120-102548120 - n NEUTRAL .36228 Confirmed somatic variant 376 NS NS 77
Here is the corresponding line from the CosmicMutantExport.tsv file stored on our server.
HSP90AA1 ENST00000334701 2565 5253 TCGA-CM-6171-01 1651233 1566020 large_intestine colon carcinoma adenocarcinoma y 1368331 c.2494G>T p.A832S Substitution - Missense het 14:102548120-102548120 - n PASSENGER/OTHER Variant of unknown origin 376 NS NS 77 Stage:I
I think you have a different version of this file. Please try this step again to make sure you have COSMIC v71 downloaded.
Get the COSMIC database
sftp 'your_email_address@example.com'@sftp-cancer.sanger.ac.uk
Download the data (NB version 71 currently works with the mapping script)
sftp> get cosmic/grch37/cosmic/v71/CosmicMutantExport.tsv.gz
sftp> exit
Extract COSMIC data
gunzip CosmicMutantExport.tsv.gz
Connected to sftp-cancer.sanger.ac.uk.
sftp> get cosmic/grch37/cosmic/v71/CosmicMutantExport.tsv.gz
File "/cosmic/grch37/cosmic/v71/CosmicMutantExport.tsv.gz" not found.
v71 doesn't exist anymore:
sftp> cd cosmic/grch37/cosmic
sftp> ls
v72 v73 v74 v75 v76 v77 v78 v79 v80 v81 v82 v83 v84
OK, it seems they have taken down the old version. We will update our pipeline to fit the newer database format then. It should be quick to do, I will let you know when it is done.
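The two rows pasted earlier have different numbers of fields, so a script that reads fields by position would break when the export format changes. A minimal sketch of reading by header name instead follows. It assumes the export has a header row with columns named `Gene name`, `Accession Number`, `Mutation CDS`, `Mutation AA` and `Mutation genome position` (check these against your file). The key format is copied from the KeyError above, and `build_cosmic_index` is a hypothetical function, not the code in map_cosmic_snp_tohg19.py.

```python
import csv
import io


def build_cosmic_index(handle):
    """Index COSMIC rows by 'COSMIC:gene:transcript:cds:aa' using header names."""
    index = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        key = "COSMIC:{}:{}:{}:{}".format(
            row["Gene name"], row["Accession Number"],
            row["Mutation CDS"], row["Mutation AA"])
        index[key] = row["Mutation genome position"]
    return index


if __name__ == "__main__":
    # Column names are assumed; the values come from the row quoted in this thread.
    sample = io.StringIO(
        "Gene name\tAccession Number\tMutation CDS\tMutation AA\t"
        "Mutation genome position\n"
        "HSP90AA1\tENST00000334701\tc.2494G>T\tp.A832S\t14:102548120-102548120\n")
    index = build_cosmic_index(sample)
    wanted = "COSMIC:HSP90AA1:ENST00000334701:c.2494G>T:p.A832S"
    # .get() returns None instead of raising KeyError for unmapped ids
    print(index.get(wanted))
```

Looking up ids with `.get()` rather than `[]` also lets a mapping script report unmapped ids and carry on instead of stopping at the first missing key.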
Thanks, I will wait patiently ;)
Hi @Weronika77,
I have updated map_cosmic_snp_tohg19.py, the script that causes the error.
It now fits the formatting of the latest COSMIC release, v84.
So try downloading the COSMIC file from v84 and rerun the pipeline.
@glormph the ipaw.nf script should pick up the latest map_cosmic_snp_tohg19.py from GitHub whenever a new commit is pushed, right?
I'm not sure if it should work already, but with the new git pull, I still get the same error with cosmic v84.
The pgpython container needs to be updated when the scripts get updated. I have updated the docker container files so you don't have to re-download the bigwig files. Proceed as follows:
cd dockerfiles
docker tag pgpython pgpython_bigwigs # to adhere to the new way to create containers
docker build -f pgpython_Dockerfile -t pgpython .
cd ..
Hope I haven't forgotten anything.
I have just tested that script and discovered that the update will not work. We probably need to hand out the COSMIC v71 data (which needs to match the VarDB search database). Continue tomorrow!
We've updated the docker container for the COSMIC peptide mapping. The following should fix that problem:
cd dockerfiles
docker build -f pgpython_Dockerfile -t pgpython .
cd ..
It finally works! I ran it with no errors. Thank you very much for all your help and patience.
Thank you very much yourself for making the pipeline better!
I have label-free data that I am trying to analyse with the IPAW pipeline, but I am getting an error that I need help with. The error message is below:
N E X T F L O W ~ version 18.10.1
Launching `ipaw.nf` [gloomy_rubens] - revision: c0cfffc9a5
WARN: Access to undefined parameter `pisepdb` -- Initialise it to a default value eg. `params.pisepdb = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
WARN: Input tuple does not match input set cardinality declared by process splitSetNormalSearchPsms
-- offending value: NA
[9e/016ff1] Submitted process > concatFasta (1)
[17/a04b64] Submitted process > makeProtSeq
[dc/a05a49] Submitted process > makeTrypSeq
[29/0a072c] Submitted process > createSpectraLookup (1)
ERROR ~ Error executing process > 'makeTrypSeq'
Caused by:
Process `makeTrypSeq` terminated with an error exit status (127)
Command executed:
msslookup seqspace -i Homo_sapiens.GRCh38.pep.all.fa --insourcefrag
Command exit status:
127
Command output:
(empty)
Command error:
.command.sh: line 2: msslookup: command not found
Work dir:
/home/javan/Desktop/proteogenomics/work/dc/a05a49b6c6f3202cd7ba13cd34943c
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (3)
@javanOkendo
Command error: .command.sh: line 2: msslookup: command not found
It seems Docker hasn't successfully found the container image 'quay.io/biocontainers/msstitch:2.5--py36_0', which provides the msslookup command. Was there any error when you installed Docker? Can you run docker container ls -a to show all the containers created?
Hi @yafeng, these are the containers I created:
docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b7b2678f8467 hello-world "/hello" 3 days ago Exited (0) 3 days ago jovial_hypatia
@javanOkendo Docker has not built any of the containers required to run this workflow. Did you follow the "Prepare once" section of the manual in our README?
@yafeng I did follow all the instructions. See the build output for my containers below.
Sending build context to Docker daemon 479.7MB
Step 1/6 : FROM cpanse/protviz
---> 17fb90fc3008
Step 2/6 : COPY spectrumAI_R_requirements.R /tmp/requirements.R
---> Using cache
---> f2993b19c7d9
Step 3/6 : RUN apt-get update && apt-get install -y libnetcdf-dev
---> Using cache
---> cd661c50fe62
Step 4/6 : RUN Rscript /tmp/requirements.R
---> Using cache
---> 843609fb645c
Step 5/6 : RUN git clone https://github.com/yafeng/SpectrumAI /SpectrumAI
---> Using cache
---> 9857e7bd22eb
Step 6/6 : RUN cd /SpectrumAI && git pull && git reset --hard d9fc290cd76a5ec09aa17c03a380ad09cbce2387
---> Using cache
---> e9a4b758c967
Successfully built e9a4b758c967
Successfully tagged spectrumai:latest
Sending build context to Docker daemon 479.7MB
Step 1/2 : FROM perl
---> 3e590895f3b8
Step 2/2 : COPY annovar /annovar
---> Using cache
---> eae1d6322280
Successfully built eae1d6322280
Successfully tagged annovar:latest
Sending build context to Docker daemon 479.7MB
Step 1/6 : FROM pgpython_bigwigs
---> d9cba8baf5d1
Step 2/6 : RUN apt-get update
---> Using cache
---> a7be4baaa108
Step 3/6 : RUN apt-get install -y python3-pip python3-dev libcurl3-dev
---> Using cache
---> cd328fe890b2
Step 4/6 : RUN pip3 install pyBigWig pysam
---> Using cache
---> c0bdcba73ca8
Step 5/6 : RUN rm -r /pgpython; git clone https://github.com/yafeng/proteogenomics_python /pgpython
---> Using cache
---> 8006cc7c8cb3
Step 6/6 : RUN cd /pgpython && git pull && git reset --hard 7c2cf3ac5d6a1f7f15dd9019438a3a4332d30c26
---> Using cache
---> f88e3b440757
Successfully built f88e3b440757
Successfully tagged pgpython:latest
@javanOkendo try the docker images command to see if the built images are there, and also check the free disk space where the docker images were built. Please also tell us your Docker version (docker version).
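The check for built images can also be scripted. A minimal sketch, assuming the image names that appear in the logs in this thread are the ones the workflow needs (the list is illustrative and may be incomplete); `missing_images` and `list_local_images` are hypothetical helpers. The `--format` option of `docker images` prints one `repository:tag` pair per line.

```python
import subprocess

# Image names as they appear in the logs in this thread; treating them as
# the required set is an assumption made only for this sketch.
REQUIRED_IMAGES = [
    "spectrumai:latest",
    "annovar:latest",
    "pgpython:latest",
    "quay.io/biocontainers/msstitch:2.5--py36_0",
]


def missing_images(listing, required=REQUIRED_IMAGES):
    """Return the required images that are absent from a 'repository:tag' listing."""
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [image for image in required if image not in present]


def list_local_images():
    """Ask the docker CLI for local images, one 'repository:tag' per line."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        check=True, capture_output=True, text=True)
    return result.stdout
```

Calling `missing_images(list_local_images())` on the machine that runs the pipeline would return an empty list when every listed image is present. Images pulled from quay.io on demand may legitimately be missing before the first run.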
@yafeng the images are okay. It ran for a few minutes and then I got an error again.
N E X T F L O W ~ version 18.10.1
Launching `ipaw.nf` [thirsty_mcclintock] - revision: b59e6478ab
WARN: Access to undefined parameter `pisepdb` -- Initialise it to a default value eg. `params.pisepdb = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
WARN: Input tuple does not match input set cardinality declared by process splitSetNormalSearchPsms
-- offending value: NA
[22/acbbba] Submitted process > concatFasta (1)
[7e/b3fba1] Submitted process > makeProtSeq
[6b/0a709b] Submitted process > makeTrypSeq
[42/583eff] Submitted process > createSpectraLookup (1)
[65/ff33e2] Submitted process > makeDecoyReverseDB (1)
[a8/669766] Submitted process > msgfPlus (1)
[aa/b059b9] Submitted process > msgfPlus (3)
[9f/3217d7] Submitted process > msgfPlus (2)
[b4/b4fbc2] Submitted process > msgfPlus (4)
[4f/90d851] Submitted process > msgfPlus (5)
ERROR ~ Error executing process > 'createSpectraLookup (1)'
Caused by:
Process `createSpectraLookup (1)` terminated with an error exit status (1)
Command executed:
msslookup spectra -i 170909_CH_C1_T006_FT.mzML 170909_CH_C1_T007_FT_R2.mzML 170909_CH_C1_T009_FT.mzML 170909_CH_C1_T010_FT_R2.mzML 170909_CH_C1_T011_FT_R2.mzML 170909_CH_C1_T013_FT.mzML 170909_CH_C1_T014_FT_R2.mzML 170909_CH_C1_T015_FT.mzML 170909_CH_C1_T025_FT.mzML 170909_CH_C1_T026_FT.mzML 170909_CH_C1_T027_FT_R2.mzML 170909_CH_C1_T037_FT.mzML 170909_CH_C1_T042_FT.mzML 170909_CH_C1_T049_FT.mzML 170909_CH_C1_T051_FT_R2.mzML 170909_CH_C1_T052_FT.mzML 170909_CH_C1_T053_FT.mzML 170909_CH_C1_T054_FT.mzML 170909_CH_C1_T061_FT_R2.mzML 170909_CH_C1_T062_FT_R2.mzML 170909_CH_C1_T064_FT_R2.mzML 170909_CH_C1_T069_FT_R2.mzML 170909_CH_C1_T073_FT_R2.mzML 170909_CH_C1_T075_FT.mzML 170909_CH_C1_T077_FT.mzML 170909_CH_C1_T080_FT.mzML 170909_CH_C1_T084_FT_R2.mzML --setnames NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Command exit status: 1
Command output: (empty)
Command error:
Traceback (most recent call last):
File "/usr/local/bin/msslookup", line 6, in
Usage: chown [-RhLHPcvf]... OWNER[<.|:>[GROUP]] FILE...
Change the owner and/or group of each FILE to OWNER and/or GROUP
-R Recurse
-h Affect symlinks instead of symlink targets
-L Traverse all symlinks to directories
-H Traverse symlinks on command line only
-P Don't traverse symlinks (default)
-c List changed files
-v List all files
-f Hide errors
Work dir: /home/javan/Desktop/proteogenomics/proteogenomics-analysis-workflow/work/42/583eff44b7692b18631df0f6435763
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
@javanOkendo can you paste your nextflow command here? I suspect the nextflow command input may be incorrect
@yafeng see my nextflow command:
./nextflow run ipaw.nf --tdb /home/javan/Desktop/proteogenomics/VarDB.fasta --mzmls /home/javan/Desktop/project_data/*.mzML --gtf /home/javan/Desktop/proteogenomics/VarDB.gtf --mods /home/javan/Desktop/proteogenomics/Mods.txt --knownproteins /home/javan/Desktop/proteogenomics/Homo_sapiens.GRCh38.pep.all.fa --blastdb /home/javan/Desktop/proteogenomics/UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --cosmic /home/javan/Desktop/proteogenomics/CosmicMutantExport.tsv --snpfa /home/javan/Desktop/proteogenomics/MSCanProVar_ensemblV79.filtered.fasta --genome /home/javan/Desktop/proteogenomics/hg19.chr1-22.X.Y.M.fa.masked --dbsnp /home/javan/Desktop/proteogenomics/snp142CodingDbSnp.txt --outdir tmp/ -profile testing
@javanOkendo add a \ before *.mzML so that the shell does not expand the wildcard itself, and try again:
--mzmls /home/javan/Desktop/project_data/\*.mzML
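The backslash keeps the shell from expanding the wildcard, so the pattern reaches Nextflow as a single argument. A minimal sketch of that quoting behaviour follows. It uses Python's `shlex` as a stand-in for the shell's quote removal (it does not perform glob expansion itself) and a hypothetical file listing to show that the receiving program can still expand the pattern.

```python
import fnmatch
import shlex

# What the shell hands over when the asterisk is escaped: the backslash is
# consumed and the pattern arrives as one literal argument.
argv = shlex.split(r"--mzmls /home/javan/Desktop/project_data/\*.mzML")
print(argv)

# Hypothetical directory listing, used to show that the pattern can
# afterwards be expanded by the program that receives it.
files = ["170909_CH_C1_T006_FT.mzML", "170909_CH_C1_T009_FT.mzML", "notes.txt"]
pattern = argv[1].rsplit("/", 1)[1]
matched = fnmatch.filter(files, pattern)
print(matched)
```

Quoting the whole pattern in single quotes has the same effect as the backslash and may be easier to read in long commands.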
@yafeng Thanks for your timely assistance. I changed that and I still get an error. See below:
./nextflow run ipaw.nf --tdb /home/javan/Desktop/proteogenomics/VarDB.fasta --mzmls /home/javan/Desktop/project_data/*.mzML --gtf /home/javan/Desktop/proteogenomics/VarDB.gtf --mods /home/javan/Desktop/proteogenomics/Mods.txt --knownproteins /home/javan/Desktop/proteogenomics/Homo_sapiens.GRCh38.pep.all.fa --blastdb /home/javan/Desktop/proteogenomics/UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta --cosmic /home/javan/Desktop/proteogenomics/CosmicMutantExport.tsv --snpfa /home/javan/Desktop/proteogenomics/MSCanProVar_ensemblV79.filtered.fasta --genome /home/javan/Desktop/proteogenomics/hg19.chr1-22.X.Y.M.fa.masked --dbsnp /home/javan/Desktop/proteogenomics/snp142CodingDbSnp.txt --outdir tmp/ -profile testing
N E X T F L O W ~ version 18.10.1
Launching `ipaw.nf` [gloomy_lorenz] - revision: b59e6478ab
WARN: Access to undefined parameter `pisepdb` -- Initialise it to a default value eg. `params.pisepdb = some_value`
WARN: Access to undefined parameter `mzmldef` -- Initialise it to a default value eg. `params.mzmldef = some_value`
Detected setnames: NA
[warm up] executor > local
WARN: Input tuple does not match input set cardinality declared by process splitSetNormalSearchPsms
-- offending value: NA
[14/97cabf] Submitted process > concatFasta (1)
[fa/6ef8e6] Submitted process > makeTrypSeq
[d0/748882] Submitted process > makeProtSeq
[ef/41aaee] Submitted process > createSpectraLookup (1)
[c3/af29cf] Submitted process > makeDecoyReverseDB (1)
[2f/b4e21e] Submitted process > msgfPlus (1)
[58/115fb8] Submitted process > msgfPlus (8)
[d4/ba1a03] Submitted process > msgfPlus (3)
[c5/d80beb] Submitted process > msgfPlus (7)
[3b/fea674] Submitted process > msgfPlus (2)
ERROR ~ Error executing process > 'msgfPlus (7)'
Caused by:
Process `msgfPlus (7)` terminated with an error exit status (247)
Command executed:
fs=`du -Lk concatdb.fasta|cut -f1`
msgf_plus -Xmx$(($fs*8/1024))M -d concatdb.fasta -s 170909_CH_C1_T025_FT.mzML -o "170909_CH_C1_T025_FT.mzid" -thread 12 -mod Mods.txt -tda 0 -t 10.0ppm -ti -1,2 -m 0 -inst 3 -e 1 -protocol 0 -ntt 2 -minLength 7 -maxLength 50 -minCharge 2 -maxCharge 6 -n 1 -addFeatures 1
msgf_plus -Xmx3500M edu.ucsd.msjava.ui.MzIDToTsv -i "170909_CH_C1_T025_FT.mzid" -o out.mzid.tsv
rm concatdb.c*
Command exit status: 247
Command output:
MS-GF+ Release (v2016.10.26) (26 Oct 2016)
Loading database files...
Warning: Sequence database contains 334 counts of letter 'U', which does not correspond to an amino acid.
Warning: Sequence database contains 306208 counts of letter 'X', which does not correspond to an amino acid.
Warning: Sequence database contains 4 counts of letter 'Z', which does not correspond to an amino acid.
Warning: Sequence database contains 2 counts of letter 'u', which does not correspond to an amino acid.
Creating the suffix array indexed file... Size: 523264305
AlphabetSize: 28
Suffix creation: 0.00% complete.
Suffix creation: 1.91% complete.
Suffix creation: 3.82% complete.
Suffix creation: 5.73% complete.
Suffix creation: 7.64% complete.
Suffix creation: 9.56% complete.
Suffix creation: 11.47% complete.
Suffix creation: 13.38% complete.
Command error:
chown: unrecognized option '--from'
BusyBox v1.22.1 (2014-05-23 01:24:27 UTC) multi-call binary.
Usage: chown [-RhLHPcvf]... OWNER[<.|:>[GROUP]] FILE...
Change the owner and/or group of each FILE to OWNER and/or GROUP
-R Recurse
-h Affect symlinks instead of symlink targets
-L Traverse all symlinks to directories
-H Traverse symlinks on command line only
-P Don't traverse symlinks (default)
-c List changed files
-v List all files
-f Hide errors
Work dir: /home/javan/Desktop/proteogenomics/proteogenomics-analysis-workflow/work/c5/d80beb606564679559113c5469f4ec
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
-- Check '.nextflow.log' file for details
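For readers trying to interpret the logged command: the `-Xmx` value passed to `msgf_plus` appears to be derived from the size of `concatdb.fasta`. The sketch below mirrors that shell arithmetic in Python, assuming the logged `$fs8` was originally `$fs*8` (the asterisk looks to have been eaten by comment formatting). The function name and the example size are hypothetical and only illustrate the calculation.

```python
def msgf_heap_mb(db_size_kb: int, factor: int = 8) -> int:
    """Mirror the shell arithmetic $(($fs*8/1024)).

    db_size_kb is what `du -Lk file | cut -f1` reports (size in KB).
    Shell arithmetic is integer-only, so floor division is used here too.
    """
    return db_size_kb * factor // 1024


# Hypothetical example: a 512,000 KB (500 MB) concatenated database
print(msgf_heap_mb(512_000))  # 4000, i.e. the command would use -Xmx4000M
```

If the resulting figure is larger than the memory available to the process, that is one thing worth ruling out, although nothing in the log above confirms it as the cause of exit status 247.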
@javanOkendo I have never seen this error before. It may be related to your spectra file input: either the file is corrupted or it is not correctly formatted. I can't think of anything else right now.
File "170909_CH_C1_T053_FT.mzML", line 9017
lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 9017, column 33946
chown: unrecognized option '--from'
BusyBox v1.22.1 (2014-05-23 01:24:27 UTC) multi-call binary.
Change the owner and/or group of each FILE to OWNER and/or GROUP
@yafeng Thanks for your patience in responding to my questions. Could you show me how to handle this?
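One way to test the suggestion that a spectra file is corrupted is to check whether each mzML file is well-formed XML before starting the pipeline. The following is a minimal sketch using only the Python standard library; `check_wellformed` is a hypothetical helper and not part of ipaw. It only tests XML well-formedness, not whether the file is valid mzML.

```python
import xml.etree.ElementTree as ET


def check_wellformed(path):
    """Return None if the file parses as XML, otherwise the parser's error message."""
    try:
        # iterparse streams the file, so large mzML files are not loaded at once
        for _event, elem in ET.iterparse(path, events=("end",)):
            elem.clear()  # release elements that have been fully read
    except ET.ParseError as err:
        return str(err)
    return None
```

Calling `check_wellformed("170909_CH_C1_T053_FT.mzML")` on each input file should return `None` for an intact file and an error message with a line and column for a truncated or damaged one. A file that fails would most likely need to be converted again from the raw data.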
Hi,
I followed the instructions in the README file exactly to install it. However, when I run this workflow, I get the following error:
So what I did next was to comment out line 82 in ipaw.nf:
.map { it -> [it.baseName.replaceFirst(/.*fr(\d\d).*/, "\$1").toInteger(), it.baseName.replaceFirst(/.*\/(\S+)\.mzML/, "\$1"), it] }
That got me further, but I still get an error.
Can you please help me with this error?
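For context on the `.map` line quoted above: its first element extracts a two-digit fraction number from the file's base name and converts it to an integer. The sketch below reproduces that step in Python to show why a file name without an `frNN` part may fail. It is an illustration of the regex behaviour, not the pipeline's code, and `fraction_from_basename` is a hypothetical name.

```python
import re

# Same pattern as in the Groovy line: anything, "fr", two digits, anything
FRACTION_RE = re.compile(r".*fr(\d\d).*")


def fraction_from_basename(basename: str) -> int:
    # Like Groovy's replaceFirst, re.sub returns the input unchanged
    # when the pattern does not match, so int() then sees the full name.
    return int(FRACTION_RE.sub(r"\1", basename, count=1))


print(fraction_from_basename("samplename_or_something_fr01"))  # 1

try:
    fraction_from_basename("170909_CH_C1_T025_FT")
except ValueError as err:
    # No "frNN" in the name, so the conversion to an integer fails
    print("cannot parse fraction:", err)
```

In Groovy the equivalent failure would surface as a `NumberFormatException` from `toInteger()`.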