Hi, I managed to run the test run of NanoCLUST and got one classification result in the output file. Is this expected? Now I'm trying to run it on my own data and so far it looks the same. I noticed that the read_correction process uses canu to correct reads for only a subset of the original file:
From main.nf:
Line 325: head -n\$(( $count*4 )) $reads > subset.fastq
Line 326: canu -correct -p corrected_reads -nanopore-raw subset.fastq genomeSize=${params.avg_amplicon_size} stopOnLowCoverage=1 minInputCoverage=2 minReadLength=500 minOverlapLength=200
And corrected_reads.corrected_reads.fastq contains only about 50 sequences. Is it supposed to be like that, and if so, why is only a subset of the original reads used?
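In case it helps clarify my reading of the code: each FASTQ record spans four lines (header, sequence, +, quality), so head -n $(( count*4 )) keeps exactly the first count reads. A minimal standalone sketch of that subsetting step outside Nextflow, assuming a hypothetical count of 100:

#!/usr/bin/env bash
count=100                                               # assumed value; in main.nf $count comes from the pipeline
head -n $(( count * 4 )) reads.fastq > subset.fastq     # 4 lines per FASTQ record -> first 100 reads
echo "$(( $(wc -l < subset.fastq) / 4 )) reads kept"    # sanity check: should print 100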