mourisl / Rcorrector

Error correction for Illumina RNA-seq reads
GNU General Public License v3.0

Unknown argument: tmp_4f2df974fb738950aa484018d9409e8f.jf_dump #30

Closed chenyanniii closed 3 years ago

chenyanniii commented 3 years ago

Hi, there,

I am trying to run Rcorrector, but it stops at the -c tmp*.jf_dump parameter. Is there any part of my system that I need to adjust?

I have posted an example of my command and the error message below:

login-20-25:/ResearchProject/Transcriptome/MiSeq_2020/fastp_reads$ perl /home/yannchen/rcorrector/run_rcorrector.pl -r BOGRSW1_R1.fastq.gz -p BOGRSW1_R2.fastq.gz
Put the kmers into bloom filter
/home/yannchen/rcorrector/jellyfish/bin/jellyfish bc -m 23 -s 100000000 -C -t 1 -o tmp_4f2df974fb738950aa484018d9409e8f.bc <(gzip -cd BOGRSW1_R1.fastq.gz) <(gzip -cd BOGRSW1_R2.fastq.gz)
Count the kmers in the bloom filter
/home/yannchen/rcorrector/jellyfish/bin/jellyfish count -m 23 -s 100000 -C -t 1 --bc tmp_4f2df974fb738950aa484018d9409e8f.bc -o tmp_4f2df974fb738950aa484018d9409e8f.mer_counts <(gzip -cd BOGRSW1_R1.fastq.gz) <(gzip -cd BOGRSW1_R2.fastq.gz)
Dump the kmers
/home/yannchen/rcorrector/jellyfish/bin/jellyfish dump -L 2 tmp_4f2df974fb738950aa484018d9409e8f.mer_counts > tmp_4f2df974fb738950aa484018d9409e8f.jf_dump
Error correction
/home/yannchen/rcorrector/rcorrector -r BOGRSW1_R1.fastq.gz -p BOGRSW1_R2.fastq.gz -c tmp_4f2df974fb738950aa484018d9409e8f.jf_dump
Unknown argument: tmp_4f2df974fb738950aa484018d9409e8f.jf_dump

mourisl commented 3 years ago

The command should be: perl /home/yannchen/rcorrector/run_rcorrector.pl -1 BOGRSW1_R1.fastq.gz -2 BOGRSW1_R2.fastq.gz for paired-end data.
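
A plausible reading of the original failure, going by the core program's usage text (-r takes one file, -p takes two): in the command that was run, -p would take BOGRSW1_R2.fastq.gz and then -c as its pair of files, leaving the .jf_dump path as a stray token. The Python sketch below is a toy parser written under that assumption. It is not Rcorrector's actual source, and parse_core_args and the shortened file names are hypothetical.

```python
def parse_core_args(argv):
    """Toy model of a parser where -r and -c take one value and -p takes two.

    Assumption: this mirrors only the option arity shown in the usage text,
    not the real rcorrector implementation.
    """
    single, paired, kmer_dump = [], [], None
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-r":
            if i + 1 >= len(argv):
                raise ValueError("-r needs one value")
            single.append(argv[i + 1])
            i += 2
        elif arg == "-p":
            if i + 2 >= len(argv):
                raise ValueError("-p needs two values")
            # The next two tokens are taken as-is, even if one looks like a flag.
            paired.append((argv[i + 1], argv[i + 2]))
            i += 3
        elif arg == "-c":
            if i + 1 >= len(argv):
                raise ValueError("-c needs one value")
            kmer_dump = argv[i + 1]
            i += 2
        else:
            raise ValueError(f"Unknown argument: {arg}")
    return {"single": single, "paired": paired, "kmer_dump": kmer_dump}


# Shape of the failing call: -p is followed by only one file, so it also takes "-c".
bad = ["-r", "R1.fq.gz", "-p", "R2.fq.gz", "-c", "tmp.jf_dump"]
try:
    parse_core_args(bad)
except ValueError as err:
    print(err)  # Unknown argument: tmp.jf_dump

# Shape of a well-formed paired-end call: both mates follow -p.
good = ["-p", "R1.fq.gz", "R2.fq.gz", "-c", "tmp.jf_dump"]
print(parse_core_args(good)["paired"])  # [('R1.fq.gz', 'R2.fq.gz')]
```

In this toy model the error names the dump file rather than the misplaced read file, which matches the message in the log above.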

chenyanniii commented 3 years ago

Thank you! It works now! By the way, could you check whether my Rcorrector is properly installed? Its usage message lists '-r' and '-p' under [OPTIONS].

(base) [yannchen@sphagnum ~ ]$ rcorrector
Usage: ./rcorrector [OPTIONS]
OPTIONS:
Required parameters:
    -r seq_file: seq_file is the path to the sequence file. Can use multiple -r to specifiy multiple sequence files
    -p seq_file_left seq_file_right: the paths to the paired-end data set. Can use multiple -p to specifiy multiple sequence files
    -i seq_file: seq_file is the path to the interleaved mate-pair sequence file. Can use multiple -i
    -c jf_dump: the kmer counts dumped by JellyFish
    -k kmer_length
Other parameters:
    -od output_file_directory (default: ./)
    -t number of threads to use (default: 1)
    -maxcor INT: the maximum number of correction every 100bp (default: 8)
    -maxcorK INT: the maximum number of correction within k-bp window (default: 4)
    -wk FLOAT: the proportion of kmers that are used to estimate weak kmer count threshold (default: 0.95)
    -stdout: output the corrected sequences to stdout (default: not used)
    -verbose: output some correction information to stdout (default: not used)

mourisl commented 3 years ago

This usage information is from the core program "rcorrector". Usually, we run Rcorrector through the wrapper "run_rcorrector.pl" (though the -r and -p parameters are implicitly supported).

"-r" is for single-end reads and "-p" is for paired-end data. If you use the "-p" parameter, it should be "-p read_1.fq read_2.fq".