tlemane / kmtricks

modular k-mer count matrix and Bloom filter construction for large read collections
GNU Affero General Public License v3.0

Clarification needed for kmer matrix columns #29

Open Louis-MG opened 8 months ago

Louis-MG commented 8 months ago

Hello, I need a matrix of k-mer counts for several samples. I followed the example in the documentation, but I could not find where the correspondence between samples and columns is specified, either in the previous link or in the documentation for the aggregate module. I would guess the columns follow the order given in the file of files (fof)? I hope I didn't miss anything. Thank you!

nikostr commented 1 month ago

I also looked for some info on this, and found the following at https://github.com/tlemane/kmtricks/wiki/IOs-API#1a-stream-it-1

std::vector<count_type> counts(reader.infos().nb_counts); // count order follows the sample order in the input fof

When I dumped the matrix previously, the output was also consistent with this ordering.
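
If the column order does follow the fof, the sample labels can be recovered by reading the fof top to bottom. Below is a minimal sketch of that idea, not part of kmtricks: `sample_ids_from_fof` and `label_row` are hypothetical helper names. It assumes each fof line has the form `<ID> : <file> ; <file> ...` and that a text dump of the matrix has one k-mer per line followed by whitespace-separated counts.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Collect sample IDs in the order they appear in the fof.
// Assumption: each non-empty line looks like "<ID> : <file> ; <file> ...".
std::vector<std::string> sample_ids_from_fof(std::istream& fof) {
    std::vector<std::string> ids;
    std::string line;
    while (std::getline(fof, line)) {
        // Everything before the first ':' is taken as the sample ID.
        const std::string id = trim(line.substr(0, line.find(':')));
        if (!id.empty()) ids.push_back(id);
    }
    return ids;
}

// Pair each count of one text-matrix row ("<kmer> c0 c1 ...") with a sample ID.
// Assumption: column i belongs to ids[i], i.e. columns follow the fof order.
std::vector<std::pair<std::string, unsigned long>>
label_row(const std::string& row, const std::vector<std::string>& ids,
          std::string& kmer) {
    std::istringstream in(row);
    in >> kmer;  // first field is the k-mer itself
    std::vector<std::pair<std::string, unsigned long>> out;
    unsigned long count = 0;
    for (std::size_t i = 0; i < ids.size() && (in >> count); ++i) {
        out.emplace_back(ids[i], count);
    }
    return out;
}
```

To use it, open the fof with `std::ifstream`, call `sample_ids_from_fof` once, then pass each line of the dumped matrix to `label_row`. If a row yields fewer pairs than there are IDs, the format assumption above does not hold for that file.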