jts / nanocorrect

Experimental pipeline for correcting nanopore reads
MIT License

correct all reads? #2

macmanes closed 9 years ago

macmanes commented 9 years ago

Your example shows how to correct a subset of the nanopore reads, e.g., python nanocorrect.py nc 1000:1020 > corrected.fasta, but how would one go about correcting the entire dataset? That seems like the more common situation. My FASTA headers are formatted like this:

grep '>' lambda.pp.fasta | head
>m000000_000000_00000_c1898213712391273/0/0_10108 all/macmanes_lab_pc_lambda_burnin_2_1142_1_ch478_file18_strand.fast5

and

grep '>' lambda.pp.fasta | tail
>m000000_000000_00000_c1898213712391273/1130/0_11518 all/macmanes_lab_pc_lambda_burnin_2_1142_1_ch482_file19_strand.fast5

macmanes commented 9 years ago

In the original file, the headers look like ch496_file27_twodirections all/macmanes_lab_pc_lambda_burnin_2_1142_1_ch496_file27_strand.fast5, so there is no obvious index number to provide there either.

jts commented 9 years ago

This can now be done by providing "all" as the input range:

python nanocorrect.py nc all
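
If a single run over the whole data set is impractical, the range form from the question could also be issued over successive batches. The following is only a minimal sketch, assuming the range is a half-open start:end interval over read positions in the FASTA file, which this thread does not confirm. The helpers count_fasta_records and make_ranges are hypothetical and not part of nanocorrect:

```python
from typing import Iterable, List


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines (lines starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def make_ranges(n_reads: int, batch_size: int) -> List[str]:
    """Split [0, n_reads) into 'start:end' strings of at most batch_size reads.

    Assumption: the end index is exclusive. Check this against the tool
    before relying on it.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        f"{start}:{min(start + batch_size, n_reads)}"
        for start in range(0, n_reads, batch_size)
    ]


if __name__ == "__main__":
    # Tiny in-memory FASTA used for illustration only.
    example = [">read_a", "ACGT", ">read_b", "GGCC", ">read_c", "TTAA"]
    n = count_fasta_records(example)
    for r in make_ranges(n, batch_size=2):
        print(r)
```

Each printed string could then take the place of 1000:1020 in the command from the question, with the outputs concatenated afterwards. For a real file, pass an open file handle to count_fasta_records instead of the example list.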