Open ecpierce opened 1 year ago
Hi Emily,
nanodisco was implemented using unfiltered input fast5 so that it can handle most situations users will face. In practice, I would consider two things when deciding whether to use pass-only or all data. First, if the coverage is limited with pass-only reads, then adding the remaining reads could help. Second, if you observe enrichment of fail reads for certain regions of interest, then it may be worth including them. From my experience, I do not expect that using only pass reads would miss motifs, considering that methylation motifs have many occurrences across the genome, but there might be rare situations that I'm not aware of.
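As a rough illustration of the two checks above, here is a minimal Python sketch. It is not part of nanodisco: the function names and the input format (read lengths, and read start positions tagged `pass` or `fail`, which you would need to extract from your own alignments) are hypothetical and only meant to show the idea.

```python
from collections import Counter


def mean_coverage(read_lengths, genome_size):
    """Approximate mean depth: total sequenced bases divided by genome size."""
    return sum(read_lengths) / genome_size


def fail_fraction_by_region(reads, region_size):
    """Fraction of 'fail' reads in each fixed-size region.

    reads: iterable of (start_position, status), status being 'pass' or 'fail'
           (hypothetical format, e.g. derived from mapped read starts).
    Returns a dict {region_index: fail_fraction}.
    """
    totals, fails = Counter(), Counter()
    for start, status in reads:
        region = start // region_size
        totals[region] += 1
        if status == "fail":
            fails[region] += 1
    return {region: fails[region] / totals[region] for region in totals}
```

If the depth computed from pass reads alone looks low for your analysis, or some regions show a clearly higher fail fraction than the rest of the genome, that would be a reason to include the fail reads.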
I hope this helps.
Best,
Alan
Hi!
I am currently re-basecalling my raw fast5 files using the fast5_out option so that I can input them into nanodisco. I am wondering if you recommend basecalling fast5 files from both the fast5_pass and fast5_fail folders that Guppy creates during live basecalling, or if I should only use the fast5_pass files? Would you expect that interesting modifications (not necessarily just methylation of specific residues) would lead to reads with lower quality scores on average, and that fast5_fail might therefore also be interesting?
Thanks! Emily