Closed mmolari closed 8 months ago
Oh cool!
My only worry is that positions in the result look suspiciously like u64 max :) Did we hit some edge case here?
I'd merge, but I see that you've added bunch of directories from the Julia branch to gitignore. They are not relevant to Rust version and this might bite us later - I will definitely miss that some files did not get committed at some point in the future, and might loose some of the work progress.
Here is a potential alternative: I clone rust
branch as a separate directory, this way I can have both Julia and Rust versions available at the same time, and I don't need to switch and worry about untracked directories specific to one or the other version. It's a complete rewrite, so switching the same local repo is very inconvenient. Do you think you can also adopt this strategy and revert the gitignore entries? Otherwise you could also try to add them to your local gitignore file called .git/info/exclude
, where .git
is the directory in the root of the checked out project.
Thanks for all of the feedback!
.git
file you are right. I removed the changes and I cloned the repository again in another folder to keep things clean.position
attribute is a bitpacked container for sequence id (first 32 bytes) locus (next 31 bytes) and strandedness (first bit). We might want to simplify this later on but for now I left it, so that we can compare directly with the Julia output.I also added some more corrections. They are all due to the fact that:
a:b
in julia include the extremes while in rust a..b
excludes b
.
These corrections should only be relevant in edge cases. I could not trigger an error with small examples.I also fixed some small bugs in the mash_distance
function and added some non-trivial tests. I think we might have still some small conceptual problems here:
I fixed some bugs in the implementation of minimizer sketching and added nontrivial tests. The bugs were simply due to the fact that Julia has 1-based indexing, while rust is 0-based. I hope I
catchedcaught all of them. The test were generated by running the original julia code on a random 100-nucleotide sequence, and getting the resulting minimizers.