Closed 123chenshixin closed 3 years ago
Hi,
The commands you are using are for an old version of nanopolish. Please see the instructions for the latest versions here:
https://nanopolish.readthedocs.io/en/latest/quickstart_consensus.html
Jared
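For orientation, here is a dry-run sketch of the command sequence the newer workflow appears to follow, pieced together from the commands and tool messages quoted later in this thread (nanopolish index, minimap2, samtools, nanopolish variants, nanopolish vcf2fasta). It only prints the commands and runs nothing. The function name print_consensus_steps and all file names are placeholders invented for this illustration, so check the linked quickstart page for the authoritative steps.

```shell
#!/usr/bin/env bash
# Dry run: prints the commands instead of executing them.
# print_consensus_steps is a hypothetical helper; the file names passed to it
# are placeholders, not paths taken from the thread.
print_consensus_steps() {
    local fast5_dir="$1" reads="$2" draft="$3" window="$4"
    cat <<EOF
nanopolish index -d ${fast5_dir} ${reads}
minimap2 -ax map-ont -t 8 ${draft} ${reads} | samtools sort -o reads.sorted.bam -T reads.tmp
samtools index reads.sorted.bam
nanopolish variants --consensus -o polished.vcf -w "${window}" -r ${reads} -b reads.sorted.bam -g ${draft}
nanopolish vcf2fasta -g ${draft} polished.vcf > polished_genome.fa
EOF
}

print_consensus_steps fast5_files/ reads.fasta draft.fa "tig00000001:200000-202000"
```

Note that the first step needs the directory of fast5 files, and that a single reads file is passed to -r.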
On Dec 26, 2020, at 10:35 PM, 123chenshixin notifications@github.com wrote:
Hi, I'm using nanopolish variants to polish a draft genome assembled with canu. I am polishing the draft genome, JYF80.contigs.fasta, with Illumina sequencing data. Since this is my first time using nanopolish, I used the commands you have posted online. They are as follows:
#!/bin/bash
for a in JYF80
do
draf_fa=/home/cxs3_z4/csx/20201209/canu/${a}/${a}.contigs.fasta
read_1_fa=/home/cxs3_z4/cff/Illumina/QC/${a}_R1.fq.gz
read_2_fa=/home/cxs3_z4/cff/Illumina/QC/${a}_R2.fq.gz
genome=./${a}.contigs.fasta
ln -s ${draf_fa} ./
# Index the draft genome
bwa index ${genome}
# Align the basecalled reads to the draft sequence
bwa mem -x ont2d -t 8 ${genome} ${read_1_fa} ${read_2_fa} | samtools sort -o reads.sorted.bam -T reads.tmp -
samtools index reads.sorted.bam
python3 /home/cxs3_z4/software/nanopolish/scripts/nanopolish_makerange.py ${genome} | parallel --results nanopolish.results -P 8 \
nanopolish variants --consensus polished.{1}.fa -w {1} -r ${read_1_fa} ${read_2_fa} -b reads.sorted.bam -g ${genome} -t 4 --min-candidate-frequency 0.1
done
The reads.sorted.bam and reads.sorted.bam.bai files are then generated, but nanopolish reports "variants: too many arguments" and repeatedly prints the same help text that "nanopolish variants -h" gives. It seems that nanopolish variants has a problem; I have tried many approaches and none of them worked. My reference genome is 12M, from a kind of yeast. I don't know whether it is too small for "-P 8".
I'm sorry, but it doesn't work. The problem is the same. I even tried entering the commands manually:
minimap2 -ax map-ont -t 8 ${genome} ${read_1_fa} ${read_2_fa} | samtools sort -o reads.sorted.bam -T reads.tmp
samtools index reads.sorted.bam
nanopolish variants --consensus -o polished.vcf \
-w "tig00000001:200000-202000" \
-r ${read_1_fa} ${read_2_fa} \
-b reads.sorted.bam \
-g ${genome}
Can you paste the full error message you received?
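The entire output is as follows:
[M::mm_idx_gen::0.488*1.01] collected minimizers
[M::mm_idx_gen::0.630*2.01] sorted minimizers
[M::main::0.630*2.01] loaded/built the index for 20 target sequence(s)
[M::mm_mapopt_update::0.718*1.88] mid_occ = 31
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 20
[M::mm_idx_stat::0.774*1.82] distinct minimizers: 2077969 (95.40% are singletons); average occurrences: 1.116; average spacing: 5.337
[M::worker_pipeline::74.244*6.41] mapped 3370249 sequences
[M::worker_pipeline::91.821*5.26] mapped 2497543 sequences
[M::worker_pipeline::163.978*5.81] mapped 3370249 sequences
[M::worker_pipeline::181.520*5.29] mapped 2497543 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax map-ont -t 8 ./JYF80.contigs.fasta /home/cxs3_z4/cff/Illumina/QC/JYF80_R1.fq.gz /home/cxs3_z4/cff/Illumina/QC/JYF80_R2.fq.gz
[M::main] Real time: 182.086 sec; CPU: 961.384 sec; Peak RSS: 3.813 GB
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
variants: too many arguments
Usage: nanopolish variants [OPTIONS] --reads reads.fa --bam alignments.bam --genome genome.fa
Find SNPs using a signal-level HMM
-v, --verbose display verbose output
--version display version
--help display this help and exit
--snps only call SNPs
--consensus run in consensus calling mode
--fix-homopolymers run the experimental homopolymer caller
--faster minimize compute time while slightly reducing consensus accuracy
-w, --window=STR find variants in window STR (format: :-)
-r, --reads=FILE the ONT reads are in fasta FILE
-b, --bam=FILE the reads aligned to the reference genome are in bam FILE
-e, --event-bam=FILE the events aligned to the reference genome are in bam FILE
-g, --genome=FILE the reference genome is in FILE
-p, --ploidy=NUM the ploidy level of the sequenced genome
-q --methylation-aware=STR turn on methylation aware polishing and test motifs given in STR (example: -q dcm,dam)
--genotype=FILE call genotypes for the variants in the vcf FILE
-o, --outfile=FILE write result to FILE [default: stdout]
-t, --threads=NUM use NUM threads (default: 1)
-m, --min-candidate-frequency=F extract candidate variants from the aligned reads when the variant frequency is at least F (default 0.2)
-i, --indel-bias=F bias HMM transition parameters to favor insertions (F<1) or deletions (F>1). this value is automatically set depending on --consensus, but can be manually set if spurious indels are called
-d, --min-candidate-depth=D extract candidate variants from the aligned reads when the depth is at least D (default: 20)
-x, --max-haplotypes=N consider at most N haplotype combinations (default: 1000)
--min-flanking-sequence=N distance from alignment end to calculate variants (default: 30)
--max-rounds=N perform N rounds of consensus sequence improvement (default: 50)
-c, --candidates=VCF read variant candidates from VCF, rather than discovering them from aligned reads
--read-group=RG only use alignments with read group tag RG
-a, --alternative-basecalls-bam=FILE if an alternative basecaller was used that does not output event annotations then use basecalled sequences from FILE. The signal-level events will still be taken from the -b bam.
--calculate-all-support when making a call, also calculate the support of the 3 other possible bases
--models-fofn=FILE read alternative k-mer models from FILE
Report bugs to https://github.com/jts/nanopolish/issues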
Oh, I see the problem now. You are trying to use nanopolish with paired-end Illumina data. Nanopolish only supports nanopore data.
Jared
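A rough way to see where "too many arguments" can come from, sketched in plain bash rather than taken from nanopolish's own parser: if an option such as -r consumes exactly one value, a second file path placed after it is left over as an extra positional argument. The function name count_leftover_args and the set of options it recognizes are hypothetical and chosen only for this illustration; the assumption is that nanopolish rejects any argument not consumed by an option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: walks an argument list in which -r, -b, -g, -w, -o
# and -t each take exactly one value, other dash-prefixed words are plain
# flags, and anything else counts as a leftover positional argument.
count_leftover_args() {
    local leftover=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -r|-b|-g|-w|-o|-t)
                shift                                  # drop the option itself
                if [ "$#" -gt 0 ]; then shift; fi      # drop its single value
                ;;
            -*)
                shift                                  # flag without a value
                ;;
            *)
                leftover=$((leftover + 1))             # not consumed by any option
                shift
                ;;
        esac
    done
    echo "$leftover"
}

# One reads file after -r: prints 0
count_leftover_args --consensus -o polished.vcf -w "tig00000001:200000-202000" \
    -r reads.fasta -b reads.sorted.bam -g draft.fa

# Two read files after -r, as with paired-end data: prints 1
count_leftover_args --consensus -o polished.vcf -w "tig00000001:200000-202000" \
    -r R1.fq.gz R2.fq.gz -b reads.sorted.bam -g draft.fa
```

Under that assumption, the second paired-end file is the argument that is "too many".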
But when I use nanopore reads to polish my draft genome, it seems that nanopolish requires the fast5 files, which are used to build the index files for the nanopore reads. However, I don't have the fast5 files for some reasons. Can I run nanopolish without them? To confirm the error, here is the output from running the commands on the page you linked:
Error: no fast5 files found
[M::mm_idx_gen::0.578*0.99] collected minimizers
[M::mm_idx_gen::0.783*1.71] sorted minimizers
[M::main::0.783*1.71] loaded/built the index for 20 target sequence(s)
[M::mm_mapopt_update::0.868*1.64] mid_occ = 31
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 20
[M::mm_idx_stat::0.920*1.60] distinct minimizers: 2079913 (95.37% are singletons); average occurrences: 1.117; average spacing: 5.331
[M::worker_pipeline::81.479*7.71] mapped 61731 sequences
[M::worker_pipeline::126.288*6.75] mapped 56266 sequences
[M::worker_pipeline::128.112*6.66] mapped 21087 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax map-ont -t 8 ./JYF80.racon3.fasta /home/cxs3_z4/yeast/nanopore/ONT_JYF80.fastq
[M::main] Real time: 128.209 sec; CPU: 853.525 sec; Peak RSS: 5.408 GB
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
[fai_load] build FASTA index.
error: could not load the index files for input file /home/cxs3_z4/yeast/nanopore/ONT_JYF80.fastq
Please run nanopolish index on your reads (see documentation)
[vcf2fasta] rewrote contig tig00000001 with 0 subs, 0 ins, 0 dels (0 skipped)
[vcf2fasta] rewrote contig tig00000002 with 0 subs, 0 ins, 0 dels (0 skipped)
.........
Sorry but you need the fast5s for nanopolish.
Jared
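As a rough pre-flight check before running nanopolish variants, something like the following could confirm whether the reads file has been indexed. It assumes that nanopolish index writes its output next to the reads file with the suffixes .index, .index.fai, .index.gzi and .index.readdb; that naming is an assumption worth verifying against your nanopolish version. The function name has_nanopolish_index and the path reads.fastq are made up for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if every assumed index file exists
# alongside the given reads file, and 1 at the first one that is missing.
has_nanopolish_index() {
    local reads="$1" suffix
    for suffix in .index .index.fai .index.gzi .index.readdb; do
        if [ ! -e "${reads}${suffix}" ]; then
            echo "missing: ${reads}${suffix}" >&2
            return 1
        fi
    done
    return 0
}

# Example usage with a placeholder path.
if has_nanopolish_index "reads.fastq"; then
    echo "index files found"
else
    echo "run nanopolish index with the fast5 directory first"
fi
```

A check like this only reports the missing index earlier; it does not remove the need for the fast5 files.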
OK. I will try another way. Thanks for replying to my questions!