nf-core / kmermaid

k-mer similarity analysis pipeline
https://nf-co.re/kmermaid
MIT License

Add hash2kmer.py code to get potential genes and sequences for each k-mer hash #134

Open olgabot opened 3 years ago

Is your feature request related to a problem? Please describe

Currently, there is no automated way to go from a k-mer hash back to the sequence that produced it. Code to do this already exists, but it is not yet integrated into kmermaid. Since there are often questions about which k-mers contribute to a cell type, having the underlying sequence and read associated with each k-mer would be very useful.

Describe the solution you'd like

Add hash2kmer as a pipeline process, run on each cell's signature individually. For BAM input, keep aligned and unaligned reads separate.
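
To make the request concrete, here is a minimal sketch of what such a step could do for a single cell. It is not the existing hash2kmer.py code. The function names, the `(read_name, sequence)` input shape, and the example reads are all hypothetical, and `kmer_hash` is a stand-in built on `hashlib`, not the hash function used to build the actual signatures. A real process would need to use the same hash, k-mer size, and molecule encoding as the signatures it is inverting.

```python
import hashlib
from collections import defaultdict


def kmer_hash(kmer: str) -> int:
    """Stand-in 64-bit hash (assumption: NOT the hash used by real signatures)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def iter_kmers(sequence: str, ksize: int):
    """Yield every overlapping k-mer of length ksize from the sequence."""
    for start in range(len(sequence) - ksize + 1):
        yield sequence[start:start + ksize]


def hash2kmer(hashes_of_interest, reads, ksize):
    """Map each hash of interest to the (k-mer, read name) pairs that produce it.

    reads is an iterable of (read_name, sequence) pairs for one cell
    (a hypothetical input shape chosen for this sketch).
    """
    wanted = set(hashes_of_interest)
    found = defaultdict(list)
    for read_name, sequence in reads:
        for kmer in iter_kmers(sequence.upper(), ksize):
            hashed = kmer_hash(kmer)
            if hashed in wanted:
                found[hashed].append((kmer, read_name))
    return dict(found)


# Hypothetical reads for one cell; in practice these would come from FASTQ/BAM.
reads = [("read1", "ACGTACGT"), ("read2", "TTTTACGA")]
target = kmer_hash("ACGT")
result = hash2kmer([target], reads, ksize=4)
```

For BAM input, the same function could be called twice per cell, once on the aligned reads and once on the unaligned reads, so that the two result tables stay separate.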

Describe alternatives you've considered

The way I do this right now is in a hacky Jupyter notebook.