bigbio / nf-workflows

Repository of Nextflow+BioContainers workflows
GNU General Public License v2.0

[Error] Invalid value for parameter -d: user.fasta (file does not exist) #28

Closed Jokendo-collab closed 4 years ago

Jokendo-collab commented 4 years ago

How do I fix this error? I am running the pipeline on the HPC compute environment and all the input files are okay.

ERROR ~ Error executing process > 'createMsgfDbIndex'

Caused by: Missing output file(s) user.revCat* expected by process createMsgfDbIndex

Command executed:

touch /tmp/test.mgf
java -jar /home/biodocker/bin/MSGFPlus_9949/MSGFPlus.jar -s /tmp/test.mgf -d user.fasta -tda 1

Command exit status: 0

Command output:

MS-GF+ Beta (v9979) (3/26/2014)
Usage: java -Xmx3500M -jar MSGFPlus.jar
    -s SpectrumFile (*.mzML, *.mzXML, *.mgf, *.ms2, *.pkl or *_dta.txt)
    -d DatabaseFile (*.fasta or *.fa)
    [-o OutputFile (*.mzid)] (Default: [SpectrumFileName].mzid)
    [-t PrecursorMassTolerance] (e.g. 2.5Da, 20ppm or 0.5Da,2.5Da, Default: 20ppm)
        Use comma to set asymmetric values. E.g. "-t 0.5Da,2.5Da" will set 0.5Da to the minus (expMass<theoMass) and 2.5Da to plus (expMass>theoMass)
    [-ti IsotopeErrorRange] (Range of allowed isotope peak errors, Default:0,1)
        Takes into account of the error introduced by chooosing a non-monoisotopic peak for fragmentation.
        The combination of -t and -ti determins the precursor mass tolerance.
        E.g. "-t 20ppm -ti -1,2" tests abs(exp-calc-n*1.00335Da)<20ppm for n=-1, 0, 1, 2.
    [-thread NumThreads] (Number of concurrent threads to be executed, Default: Number of available cores)
    [-tda 0/1] (0: don't search decoy database (Default), 1: search decoy database)
    [-m FragmentMethodID] (0: As written in the spectrum or CID if no info (Default), 1: CID, 2: ETD, 3: HCD)
    [-inst MS2DetectorID] (0: Low-res LCQ/LTQ (Default), 1: Orbitrap/FTICR, 2: TOF, 3: Q-Exactive)
    [-e EnzymeID] (0: unspecific cleavage, 1: Trypsin (Default), 2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: glutamyl endopeptidase, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: no cleavage)
    [-protocol ProtocolID] (0: Automatic (Default), 1: Phosphorylation, 2: iTRAQ, 3: iTRAQPhospho, 4: TMT, 5: Standard)
    [-ntt 0/1/2] (Number of Tolerable Termini, Default: 2)
        E.g. For trypsin, 0: non-tryptic, 1: semi-tryptic, 2: fully-tryptic peptides only.
    [-mod ModificationFileName] (Modification file, Default: standard amino acids with fixed C+57)
    [-minLength MinPepLength] (Minimum peptide length to consider, Default: 6)
    [-maxLength MaxPepLength] (Maximum peptide length to consider, Default: 40)
    [-minCharge MinCharge] (Minimum precursor charge to consider if charges are not specified in the spectrum file, Default: 2)
    [-maxCharge MaxCharge] (Maximum precursor charge to consider if charges are not specified in the spectrum file, Default: 3)
    [-n NumMatchesPerSpec] (Number of matches per spectrum to be reported, Default: 1)
    [-addFeatures 0/1] (0: output basic scores only (Default), 1: output additional features)
Example (high-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 20ppm -ti -1,2 -ntt 2 -tda 1 -o testMSGFPlus.mzid
Example (low-precision): java -Xmx3500M -jar MSGFPlus.jar -s test.mzXML -d IPI_human_3.79.fasta -t 0.5Da,2.5Da -ntt 2 -tda 1 -o testMSGFPlus.mzid

Command error: [Error] Invalid value for parameter -d: user.fasta (file does not exist)

Work dir: /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/b4/89b9cb0d84d120f2b7bc6d1c89fe3e

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details

executor > local (3)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔
[b4/89b9cb] process > createMsgfDbIndex [100%] 1 of 1, failed: 1 ✘
[88/301cea] process > createDecoyDb [100%] 1 of 1, failed: 1
Pulling Singularity image docker://biocontainers/spectra-cluster-cli:vv1.1.2_cv2 [cache /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/singularity/biocontainers-spectra-cluster-cli-vv1.1.2_cv2.img]
WARN: Killing pending tasks (1)
ERROR ~ Error executing process > 'createMsgfDbIndex'

jgriss commented 4 years ago

Hi @javanOkendo,

Can you please post your nextflow run command? It seems you are missing the required FASTA file.
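A quick pre-flight check on the FASTA path can rule this out before launching the pipeline. The following is a minimal sketch, not part of nf-workflows: `check_fasta` is a hypothetical helper name, and the only assumption about the file is that a valid FASTA contains at least one header line starting with `>`.

```shell
# Hypothetical helper (not part of nf-workflows): verify that a FASTA path
# is usable before passing it to the pipeline via --fasta_file.
check_fasta() {
    if [ -z "$1" ]; then
        echo "no FASTA path given" >&2
        return 2
    fi
    if [ ! -f "$1" ]; then
        echo "not a regular file: $1" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "not readable: $1" >&2
        return 1
    fi
    # Assumption: a usable FASTA has at least one header line starting with '>'.
    if ! grep -q '^>' "$1"; then
        echo "no FASTA header ('>') found in: $1" >&2
        return 1
    fi
    echo "ok: $1"
}
```

Calling `check_fasta` with the same path that is given to `--fasta_file` prints `ok: <path>` on success and returns a non-zero status otherwise. It only tests the path as seen by the current shell, so it says nothing about what a containerised task can see.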

Jokendo-collab commented 4 years ago

Hi @jgriss

I am now getting a different error with the following command:

nextflow run -resume main.nf --prec_rol 20 --frag_tol 0.5 \
    --mc 2 --min_ident 2 --min_ratio 0.7 --raw_dir /scratch/oknjav001/mascot_genericRawdata/rawdata \
    --fasta_file /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/test/2020-02-13-reviewed-contam-UP000005640.fas

Jokendo-collab commented 4 years ago

@jgriss The above command gave me an "MGF files not found" error, even though I made sure the path to the data is correct.

N E X T F L O W ~ version 19.04.1
Launching main.nf [marvelous_woese] - revision: e3fa8abc59
[warm up] executor > local
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/singularity
Pulling Singularity image docker://biocontainers/searchgui:v2.8.6_cv2 [cache /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/singularity/biocontainers-searchgui-v2.8.6_cv2.img]
Pulling Singularity image docker://biocontainers/msgfp:v9949_cv3 [cache /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/singularity/biocontainers-msgfp-v9949_cv3.img]
Pulling Singularity image docker://biocontainers/spectra-cluster-cli:vv1.1.2_cv2 [cache /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/singularity/biocontainers-spectra-cluster-cli-vv1.1.2_cv2.img]
executor > local (1)
[44/51c197] process > createTandemConfig [ 0%] 0 of 1

executor > local (1)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔

executor > local (4)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔
[57/7fda60] process > createMsgfDbIndex [ 0%] 0 of 1
[c4/2a6f9f] process > runClustering [ 0%] 0 of 2

executor > local (5)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔
[57/7fda60] process > createMsgfDbIndex [ 0%] 0 of 1
[c4/2a6f9f] process > runClustering [ 0%] 0 of 2
[4c/2683d1] process > createDecoyDb [ 0%] 0 of 1

executor > local (5)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔
[57/7fda60] process > createMsgfDbIndex [ 0%] 0 of 1
[c4/2a6f9f] process > runClustering [ 0%] 0 of 2
[4c/2683d1] process > createDecoyDb [ 0%] 0 of 1
ERROR ~ Error executing process > 'runClustering (1)'

Caused by: Process runClustering (1) terminated with an error exit status (1)

Command executed:

if [ `ls -1 *.mgf | grep -c ".xt.mgf"` -gt 0 ]; then
    ENGINE="xtandem"
else
    ENGINE="msgf"
fi

spectra-cluster-cli -major_peak_jobs 1 -threshold_start 1 -threshold_end 0.99 -rounds 5 -precursor_tolerance 20 -precursor_tolerance_unit ppm --fragment_tolerance 0.5 -filter mz_150 -output_path ${ENGINE}.clustering *.mgf

Command exit status: 1

Command output:
spectra-cluster API Version 1.0.11
Created by Rui Wang & Johannes Griss

-- Settings --
Number of threads: 1
Thresholds: 1.0 - 0.99 in 5 rounds
Keeping binary files: false
Binary file directory: /tmp/spectra_cluster_cli8824954918383120370
Result file: msgf.clustering
Reuse binary files: false
Input files: 1
Using fast mode: no

Other settings:
Precursor tolerance: 20.0 ppm
Fragment ion tolerance: 0.5
Loading filter: Filtering top N peaks per spectrum
Added filters: mz_150
Minimum number of comparisons: adaptive
Converting 1 input files...
Error: *.mgf (No such file or directory)

Command error:
WARNING: destination /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/c4/2a6f9f8c951f5cf6fa69e6dd1aa704 already in mount list: destination is already in the mount point list
ls: cannot access '*.mgf': No such file or directory
/home/biodocker/bin/spectra-cluster-cli: line 1: -e: command not found
java.io.FileNotFoundException: *.mgf (No such file or directory)
    at java.io.RandomAccessFile.open0(Native Method)
    at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124)
    at uk.ac.ebi.pride.tools.braf.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:45)
    at uk.ac.ebi.pride.spectracluster.spectra_list.ParsingMgfScanner.parseMgfFile(ParsingMgfScanner.java:63)
    at uk.ac.ebi.pride.spectracluster.spectra_list.ParsingMgfScanner.getSpectrumReferences(ParsingMgfScanner.java:41)
    at uk.ac.ebi.pride.spectracluster.binning.BinningSpectrumConverter.processPeaklistFiles(BinningSpectrumConverter.java:80)
    at uk.ac.ebi.pride.spectracluster.implementation.SpectraClusterStandalone.convertInputFiles(SpectraClusterStandalone.java:418)
    at uk.ac.ebi.pride.spectracluster.implementation.SpectraClusterStandalone.clusterPeaklistFiles(SpectraClusterStandalone.java:101)
    at uk.ac.ebi.pride.spectracluster.cli.SpectraClusterCliMain.run(SpectraClusterCliMain.java:344)
    at uk.ac.ebi.pride.spectracluster.cli.SpectraClusterCliMain.main(SpectraClusterCliMain.java:43)

Work dir: /scratch/oknjav001/spectral_clustering/nf-workflows/lfq-clustering/work/c4/2a6f9f8c951f5cf6fa69e6dd1aa704

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details

executor > local (5)
[44/51c197] process > createTandemConfig [100%] 1 of 1 ✔
[57/7fda60] process > createMsgfDbIndex [100%] 1 of 1, failed: 1
[c4/2a6f9f] process > runClustering [100%] 2 of 2, failed: 2 ✘
[4c/2683d1] process > createDecoyDb [100%] 1 of 1, failed: 1
WARN: Killing pending tasks (2)
ERROR ~ Error executing process > 'runClustering (1)'


jgriss commented 4 years ago

Hi @javanOkendo,

It seems that the pipeline could not find any MGF files in the location you specified. Can you confirm that the directory /scratch/oknjav001/mascot_genericRawdata/rawdata actually contains MGF files? Their names are expected to end with ".mgf", and the match is case-sensitive.
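The extension check described above can be sketched in shell. This is an illustration only: `count_mgf` and `list_wrong_case_mgf` are hypothetical helper names, the sketch assumes the files sit directly in the given directory (not in subdirectories), and it relies on the `-maxdepth` and `-iname` options available in GNU and BSD `find`.

```shell
# Hypothetical helpers (not part of nf-workflows).

# Count entries directly inside directory $1 whose names end in ".mgf"
# (lower case only). Symlinks are counted too; directories are not.
count_mgf() {
    find "$1" -maxdepth 1 ! -type d -name '*.mgf' | wc -l | tr -d ' '
}

# List entries whose extension matches ".mgf" only when case is ignored,
# e.g. "run1.MGF". A case-sensitive "*.mgf" match would miss these.
list_wrong_case_mgf() {
    find "$1" -maxdepth 1 ! -type d -iname '*.mgf' ! -name '*.mgf'
}
```

If `count_mgf` reports 0 for the directory passed to `--raw_dir` while `list_wrong_case_mgf` prints file names, renaming those files to a lower-case extension would be the first thing to try.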

Jokendo-collab commented 4 years ago

@jgriss This is the directory; the total data size is 3.0G. [image]

Jokendo-collab commented 4 years ago

@jgriss I configured this to run on the HPC. Could that be the reason why the program cannot "see" the MGF directory?

jgriss commented 4 years ago

Hi @javanOkendo,

I do not know how your HPC is set up. I suggest you create a nextflow workflow that simply lists the files it can see to make sure they are accessible to the job.

Based on your log file, none of the jobs that depend on MGF files were run, because no MGF files were found.

Since this is not really a bug we can fix on our side, I will close this issue for now.
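The "list the files the job can see" check suggested above could look roughly like the shell sketch below. `report_dir` is a hypothetical helper, not something the pipeline provides. The result is only meaningful if the function runs in the same context as the failing task, for example inside the script block of a throwaway Nextflow process or inside the container. Since the log shows tasks running in Singularity images, one possibility it could reveal is that the host path is not visible inside the container; that is an assumption, not something this thread confirms.

```shell
# Hypothetical diagnostic (not part of nf-workflows): report what the
# current process can see of directory $1. Returns 0 only if the
# directory exists and can be listed.
report_dir() {
    echo "host: $(uname -n)"
    echo "dir:  $1"
    if [ ! -d "$1" ]; then
        echo "status: missing (not visible from this process)"
        return 1
    fi
    if [ ! -r "$1" ] || [ ! -x "$1" ]; then
        echo "status: present but not readable"
        return 1
    fi
    # ls -A includes dotfiles but omits "." and "..".
    echo "status: visible, $(ls -A "$1" | wc -l | tr -d ' ') entries"
}
```

Running `report_dir /scratch/oknjav001/mascot_genericRawdata/rawdata` once on the login node and once from inside a task, then comparing the two outputs, would show whether the directory is reachable where the pipeline actually executes.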