Open remiolsen opened 10 months ago
Part of this mode will rely on #59 to determine which adaptor template(s) to map to. The Anglerfish CLI will have to be modified to run without an input samplesheet. More importantly, it needs to cluster the index hits (unmatched indices in current anglerfish terminology) with some allowance for edit distance. Finally, some measure of confidence needs to be called for each cluster. We will have to put on our statistician hats 🙀
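To make the clustering idea concrete, here is a rough sketch of greedy clustering of index hits with an edit-distance allowance. Everything in it is an assumption for illustration and not existing anglerfish code: the function names `edit_distance` and `cluster_indices`, the `max_dist` parameter, and the dictionary layout are hypothetical. The `fraction` value is only a placeholder for the per-cluster confidence, which still needs a proper statistical definition.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cluster_indices(index_hits, max_dist=1):
    """Greedily cluster index sequences (hypothetical helper).

    Sequences are visited from most to least abundant. Each one joins the
    first existing cluster whose centroid is within max_dist edits,
    otherwise it starts a new cluster with itself as centroid.
    """
    counts = Counter(index_hits)
    clusters = []
    for seq, n in counts.most_common():
        for cluster in clusters:
            if edit_distance(seq, cluster["centroid"]) <= max_dist:
                cluster["members"][seq] = n
                cluster["reads"] += n
                break
        else:
            clusters.append({"centroid": seq, "members": {seq: n}, "reads": n})

    # Placeholder "confidence": share of all index hits in the cluster.
    total = sum(counts.values())
    for cluster in clusters:
        cluster["fraction"] = cluster["reads"] / total
    return clusters


if __name__ == "__main__":
    hits = ["ACGTACGT"] * 5 + ["ACGTACGA"] * 2 + ["TTTTGGGG"] * 3
    for cluster in cluster_indices(hits, max_dist=1):
        print(cluster["centroid"], cluster["reads"], cluster["fraction"])
```

Greedy, abundance-ordered clustering is just one simple option: it is easy to reason about, but the result depends on the visiting order and on `max_dist`, so it should be read as a starting point for discussion.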
Add a separate or parallel step to the main anglerfish algorithm that classifies reads not by the adaptors and index setups found in the input samplesheet, but by clustering reads against all known adaptors and setups. This would be especially useful for cases where we have "unknown unknowns" / unicorns in our sequencing pools.