Closed leoisl closed 1 year ago
Looks straightforward to me. I was a bit worried where you pass in the sample id fixed to zero in add_clusters_to_pangraph, but decided this must be because you only ever map one sample.
Yeah, it is exactly this. I had the same worry when I first read the pandora code, but then later it made sense too!
@Danderson123 I think it is very hard for you to review this; when I am cavalier and skip over things, I am relaxed about it, but you might be overwhelmed by all this code. If this is not merged by tomorrow, maybe we can talk you through how we look through this kind of PR and what we look for.
This is a cleanup PR. It removes two unused modules:
The reasons for removing these modules are:
pandora, and include a gene-DBG module and a noise-filtering module; this PR also removes read-information bookkeeping code, which reduces RAM usage by ~6x with even minor improvements to results (see https://github.com/rmcolq/pandora/issues/330#issue-1752440505).
There is a small follow-up PR to this one, fixing some small issues and the CLI, but I wanted to keep it separate from this one, which is the large cleanup.