replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

not enough reads in all samples; could not generate any genomes #166

Closed omarkr8 closed 2 years ago

omarkr8 commented 2 years ago

I'm trying to test samples processed using the V1200 rapid barcoding kit.

I used this command: nextflow run replikation/poreCov --fastq_pass fastq_folder/ --primerV V1200 --rapid TRUE --output results -profile local,docker -r 0.11.0

I keep getting:

Not enough reads in all samples, please investigate results/1.Read_quality
Could not generate any genomes, please check your reads results/1.Read_quality
[d1/73d2bc] NOTE: Process collect_fastq_wf:collect_fastq (1) terminated with an error exit status (1) -- Error is ignored

The FASTQ files definitely do have reads: looking at them in Geneious gives me >20k reads. The FASTQ files I use as input here are one file per sample (5 samples tested). Each file was merged from the basecalled FASTQ files using cat.

replikation commented 2 years ago

If you have one FASTQ file per sample, I suggest using --fastq instead, e.g. --fastq "*.fastq.gz". If the barcodes are trimmed from your samples, --fastq_pass might have some problems.

omarkr8 commented 2 years ago

What would the file structure look like? Isn't the --fastq option for single FASTQ inputs?

replikation commented 2 years ago

It's a file input, for one FASTQ file per sample. If you want to use multiple samples, use --fastq "*.fastq.gz".

Did you try to reduce the min length? Maybe your reads are quite short?
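A quick way to check whether reads are being lost to a length filter is to count how many fall inside a given length window. The following is a minimal Python sketch, not part of poreCov; the function names are made up for illustration, it assumes standard 4-line FASTQ records, and the thresholds you pass should be whatever you actually give the workflow.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_lengths(path):
    """Yield the sequence length of every record (assumes 4 lines per record)."""
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def summarize(path, min_length, max_length):
    """Return (total reads, reads with min_length <= length <= max_length)."""
    total = kept = 0
    for length in read_lengths(path):
        total += 1
        if min_length <= length <= max_length:
            kept += 1
    return total, kept
```

For example, summarize("barcode01.fastq.gz", 100, 1400) returns the total read count and the number of reads between 100 and 1400 bases; the file name here is a placeholder. If the second number is far smaller than the first, the length window is the likely reason for "not enough reads".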

omarkr8 commented 2 years ago

I'll try to describe the data.

The basecalled FASTQ files contain reads of various lengths, generated with the Midnight kit (V1200 + rapid barcoding). While there are shorter reads, there are plenty of 300-1000 bp ones.

MinKNOW generated about 3k FASTQ files for each barcode (96 samples), so to make it easier downstream, I ran cat on each barcode so that I had 96 FASTQ files. The content of the FASTQ files shouldn't have changed.

So I'm thinking that either my command options are wrong, my MinKNOW basecalling options (maybe barcoding) are incompatible with poreCov, or the cat step did change something. For MinKNOW basecalling, I followed the parameters in the Nanopore protocol: "During the run setup, in the basecalling tab, enable Barcoding and in options, enable Mid-read barcodes and Override mid barcoding score. Minimum mid barcoding score should be set to 50 and Minimum barcoding score set to 60."

I'll try running poreCov with the un-cat FASTQ files next.

omarkr8 commented 2 years ago

So running with the basecaller's output FASTQ files worked. Was there something wrong with using cat? How would other people normally merge FASTQ files?

The poreCov command that went through was: --fastq_pass fastq_dir/ --primerV V1200 --rapid TRUE --minLength 100 --minLength 1400 --min_depth 2 -r 0.110 --cores 2
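For reference, one common way to merge per-barcode chunks can be sketched in Python as below. This is an illustrative sketch only, not poreCov's own method; the function names and the directory layout (one folder of *.fastq.gz chunks per barcode) are assumptions. It does a byte-for-byte concatenation, which is valid for gzip because a gzip stream may contain several members, and then counts lines so the merged file can be checked against the inputs.

```python
import gzip
import shutil
from pathlib import Path


def merge_fastq_gz(chunk_dir, merged_path):
    """Concatenate every *.fastq.gz in chunk_dir into merged_path, in name order.

    Write merged_path outside chunk_dir so it is never picked up as an input.
    Returns the number of chunk files that were merged.
    """
    chunks = sorted(Path(chunk_dir).glob("*.fastq.gz"))
    with open(merged_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, out)  # byte-for-byte copy, like `cat`
    return len(chunks)


def count_fastq_lines(path):
    """Count the lines in a gzip-compressed FASTQ file (4 lines per read)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle)
```

After merging, count_fastq_lines on the merged file should be divisible by 4 and equal the sum over the chunks. A mismatch would point to a truncated chunk or a file without a trailing newline rather than to the merge itself.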

replikation commented 2 years ago