Open bluegenes opened 7 months ago
Just remembered that `scaled` is in `minhash`, not `signature`, so we don't have access to it without loading the minhash. So this would mean no `scaled` selection in `signature::select`, which is probably not desirable.
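
To make the constraint concrete, here is a minimal sketch of the layout problem. `ToyMinHash`, `ToySignature`, `load_minhash`, `matches_name` and `matches_scaled` are hypothetical stand-ins invented for illustration, not the actual sourmash types or methods. The only point is that a metadata-only check costs nothing, while any check on `scaled` forces a load.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a MinHash sketch. Note that `scaled` lives here.
#[derive(Debug, Clone, PartialEq)]
struct ToyMinHash {
    scaled: u64,
    hashes: Vec<u64>,
}

/// Hypothetical stand-in for a signature: metadata plus a sketch that has
/// to be "loaded" before any of its fields can be inspected.
struct ToySignature {
    name: String,
    stored: ToyMinHash, // pretend this sits on disk
    loads: Cell<usize>, // counts how many times the sketch was loaded
}

impl ToySignature {
    fn new(name: &str, stored: ToyMinHash) -> Self {
        ToySignature {
            name: name.to_string(),
            stored,
            loads: Cell::new(0),
        }
    }

    /// Simulates loading the sketch; every call counts as one load.
    fn load_minhash(&self) -> ToyMinHash {
        self.loads.set(self.loads.get() + 1);
        self.stored.clone()
    }

    /// Selecting on metadata needs no load.
    fn matches_name(&self, name: &str) -> bool {
        self.name == name
    }

    /// Selecting on `scaled` has to load the sketch first. A sketch can be
    /// downsampled to `scaled` only if its own value is not larger.
    fn matches_scaled(&self, scaled: u64) -> bool {
        let mh = self.load_minhash();
        mh.scaled <= scaled
    }
}
```

In this toy model, `matches_name` leaves the load counter at zero, while each call to `matches_scaled` increments it and throws the loaded sketch away.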
At the moment, we use `signature::select` to downsample minhashes if needed. However, this means we load the minhash during the `select`, which may happen during `sig_for_dataset`, for example, and then discard it, returning just the signature. That means the minhash needs to be loaded again by the user.

We can avoid loading twice by moving downsampling into the `.minhash()` or `.get_sketch()` methods instead. What do you think @luizirber?

The downside is that there are other ways to get the minhash, e.g. `.sketches()[0]`^, and those wouldn't return the downsampled minhash even after `select` has been run on the signature.

^ I know we're thinking of deprecating `sketches`, but there are probably other ways too?
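
A rough sketch of what I have in mind, to make the trade-off easier to discuss. Everything here is a toy model: `ToyMinHash`, `ToySignature`, the `requested_scaled` field and the `u64::MAX / scaled` cutoff are assumptions made for illustration, not the real sourmash structs or its exact max-hash computation. `select` only records the requested `scaled`, the accessor applies it, and the raw `sketches()` accessor shows the downside.

```rust
/// Hypothetical stand-in for a MinHash sketch.
#[derive(Debug, Clone, PartialEq)]
struct ToyMinHash {
    scaled: u64,
    hashes: Vec<u64>,
}

impl ToyMinHash {
    /// Keep only hashes below the cutoff for the new `scaled`.
    /// Returns None when asked to "upsample" (smaller scaled) or for scaled == 0.
    /// The cutoff formula is a simplification chosen for this sketch.
    fn downsample(&self, scaled: u64) -> Option<ToyMinHash> {
        if scaled == 0 || scaled < self.scaled {
            return None;
        }
        let max_hash = u64::MAX / scaled;
        let hashes = self
            .hashes
            .iter()
            .copied()
            .filter(|h| *h <= max_hash)
            .collect();
        Some(ToyMinHash { scaled, hashes })
    }
}

/// Hypothetical stand-in for a signature that remembers the selection.
struct ToySignature {
    sketches: Vec<ToyMinHash>,
    requested_scaled: Option<u64>,
}

impl ToySignature {
    /// `select` records the requested scaled without touching the sketch.
    fn select(mut self, scaled: u64) -> Self {
        self.requested_scaled = Some(scaled);
        self
    }

    /// Downsampling happens here, at access time, so the sketch is
    /// only processed once, by the caller who actually needs it.
    fn minhash(&self) -> Option<ToyMinHash> {
        let mh = self.sketches.first()?;
        match self.requested_scaled {
            Some(scaled) => mh.downsample(scaled),
            None => Some(mh.clone()),
        }
    }

    /// Raw accessor: bypasses the recorded selection entirely.
    fn sketches(&self) -> &[ToyMinHash] {
        &self.sketches
    }
}
```

With this shape, `sig.select(2).minhash()` gives back a sketch at `scaled = 2`, but `sig.sketches()[0]` still hands out the original one, which is exactly the inconsistency described above.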