UPHL-BioNGS / Donut_Falls

Basic workflow for nanopore sequencing data
MIT License

Add check that Illumina and Nanopore reads are from same isolate #60

Closed erinyoung closed 2 months ago

erinyoung commented 3 months ago

I'm thinking mash dist can do this

cat illumina*.fastq | mash sketch -m 2 -o illumina -
mash sketch -m 2 nanopore.fastq -o nanopore
echo -e "illumina\tnanopore\tmash-distance\tP-value\tmatching-hashes" > mash_dist.txt
mash dist illumina.msh nanopore.msh >> mash_dist.txt
distance=$(cut -f 3 mash_dist.txt | tail -n 1)

Unicycler and polishing would then only proceed when the mash distance is within a 0.01 limit. Perhaps the Illumina reads get filtered out when the distance falls outside that limit.
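
A rough sketch of how that 0.01 gate might look in shell. The function names `within_limit` and `check_mash_dist` are made up for illustration and are not part of Donut Falls. It assumes the tab-separated mash_dist.txt layout from the commands above, with the distance in column 3 of the last line.

```shell
# Hypothetical helper: exits 0 when the distance ($1) is at or below the limit ($2).
# awk does the comparison because [ ] cannot compare floating-point values.
within_limit() {
  awk -v d="$1" -v l="$2" 'BEGIN { exit !(d + 0 <= l + 0) }'
}

# Hypothetical wrapper: reads the distance from the last line of a
# mash_dist.txt-style table and prints "pass" or "fail".
# The default limit of 0.01 is the value proposed in this issue.
check_mash_dist() {
  table="$1"
  limit="${2:-0.01}"
  distance=$(cut -f 3 "$table" | tail -n 1)
  if within_limit "$distance" "$limit"; then
    echo "pass"
  else
    echo "fail"
  fi
}
```

Something like `[ "$(check_mash_dist mash_dist.txt)" = "pass" ]` could then decide whether the Illumina reads are passed on to Unicycler and polishing or dropped.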