NVIDIA-Genomics-Research / GenomeWorks

SDK for GPU accelerated genome assembly and analysis
https://clara-parabricks.github.io/GenomeWorks/
Apache License 2.0
285 stars 76 forks source link

[cudamapper] Create public interface for IndexDescriptor class #536

Closed nvvishanthi closed 4 years ago

nvvishanthi commented 4 years ago

closes #517

nvvishanthi commented 4 years ago

@mimaric I'm still working through refactoring the caching code to use the public interface as well as removing an instance of code duplication for group_reads_into_indices(), but I wanted to make sure that whatever I moved to the public interface is ok.

nvvishanthi commented 4 years ago

All it does is provide first_read and number_of_reads values. It also generates a hash from the two, but that doesn't seem to be used anywhere, or is it?

@ahehn-nv the hashing stuff is specifically in caching code (see cudamapper/src/index_cache.cu), so how we cache indices would need some modification.

I would support making this a simple struct (I think we could also stuff the hash-related parts into the struct, right?).

Alternatively, we could get rid of first_read and number_of_reads in the public interface and make create_index() take an IndexDescriptor as parameter instead

If we can reconcile this with the index caching that requires the hash, then this is also fine. I'll leave it up to @tijyojwad and @mimaric as to what the best solution would be

nvvishanthi commented 4 years ago

rerun tests

ahehn-nv commented 4 years ago

@ahehn-nv the hashing stuff is specifically in caching code (see cudamapper/src/index_cache.cu), so how we cache indices would need some modification.

Ok, if the hash is not removed, then it should be a class imo, since the data in the hash is tightly coupled to the other two elements. So let's leave it as it is.

mimaric commented 4 years ago

@nvvishanthi IndexDescriptorHash has to be a separate struct/class unless we want to have IndexDescriptor::operator() return hash value, which sounds confusing.

@ahehn-nv I like the idea of Index accepting IndexDescriptor instead of first_read and number_of_reads explicitly, but it should probably be done in a separate PR. I created #539 to track it.