Closed NateBaileyTempus closed 1 year ago
Hey Nate, can you have a look at https://github.com/onecodex/finch-rs/pull/64?
Hope you're doing well!
Oh! HashMap probably a way better solution! Doing good, will check it out and let you know if it runs without error!
It will be slower than the current code though, we don't have benchmarks sadly :/
This new branch worked like a champ on the files that failed with the previous iteration! I had four files fail with the same error but with different vector lengths over the 65k and this time they all worked!
Issue when running large files through finch u64 vector is constrained by u16 length of 65536 resulting in the error:
thread 'main' panicked at 'index out of bounds: the len is 65536 but the index is 126497'
suggested fix is to increase the length of this vector u32 would do for all file types and file sizes
pub fn hist(sketch: &[KmerCount]) -> Vec<u64> { let mut counts = vec![0u64; 65536]; let mut max_count: u64 = 0; for kmer in sketch { max_count = cmp::max(max_count, u64::from(kmer.count)); counts[kmer.count as usize - 1] += 1; } counts.truncate(max_count as usize); counts }