-
@idreeskhan pointed me to Space Saving and other variants, approximate algorithms that can answer top K items & frequencies. Could be nice to have.
Right now we have Count-Min Sketch from Algebird …
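
For reference, a minimal sketch of the Space Saving idea in plain Python. It is not the Algebird API; the class name `SpaceSaving` and its methods are hypothetical, chosen only for illustration. It keeps at most `capacity` counters and, when full, replaces the smallest counter with the new item, which then inherits that count as its maximum possible overestimate.

```python
from typing import Dict, Hashable, List, Tuple


class SpaceSaving:
    """Approximate top-K counter tracking at most `capacity` items (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}  # estimated frequency per tracked item
        self.errors: Dict[Hashable, int] = {}  # upper bound on the overestimate

    def add(self, item: Hashable, weight: int = 1) -> None:
        if item in self.counts:
            # Already tracked: bump its counter.
            self.counts[item] += weight
        elif len(self.counts) < self.capacity:
            # Free slot: start an exact counter.
            self.counts[item] = weight
            self.errors[item] = 0
        else:
            # Full: evict the item with the smallest counter and let the
            # newcomer inherit that count as its possible overestimate.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + weight
            self.errors[item] = floor

    def top(self, k: int) -> List[Tuple[Hashable, int]]:
        """Return up to k (item, estimated count) pairs, largest first."""
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:k]
```

The linear scan for the minimum keeps the example short; a production version would typically use a structure that finds the minimum counter in constant time. The true frequency of any tracked item lies between `counts[item] - errors[item]` and `counts[item]`.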
-
Hello, this might be a very obvious question, but I've plotted my magnetometer values on a 3D scatter graph. The next step in the wiki is to use the code you state, which is this:
void magcalMPU9…
-
## Problem Statement
Add support for data skipping indexes.
## Background and Motivation
Hyperspace has so far supported only hash-partitioned covering indexes. Covering indexes are good for s…
-
- Check how to improve elasticsearch's performance
- Build a pre-indexer that filters out data that has been indexed for a given column. Basically this requires a count-min sketch per column, so that …
-
when I run this:
singularity exec $pggb_path/pggb_latest.sif wfmash -t 20 $genome_path/ganganF73.genome.fa.gz --query-file-list=/home/user/huyang/shuai/data/pan_test/genome/genome.txt > aln.paf
I …
-
Hello! I want to write some code to normalize hash abundances in a signature file by dividing each abundance by the total # hashes in a signature. The goal is to compare signatures from different sample…
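
To make the intended arithmetic concrete, here is a minimal sketch in plain Python. It assumes the abundances are already available as a `{hash: abundance}` dictionary and that "total # hashes" means the sum of all abundances; both are assumptions, and `normalize_abundances` is a hypothetical helper, not part of any signature library. If the number of distinct hashes is meant instead, the divisor would be `len(abundances)`.

```python
from typing import Dict


def normalize_abundances(abundances: Dict[int, int]) -> Dict[int, float]:
    """Divide each hash abundance by the total abundance of the signature.

    Assumption: `abundances` maps hash value -> raw abundance count.
    """
    total = sum(abundances.values())
    if total == 0:
        # Empty signature (or all-zero abundances): nothing to normalize.
        return {}
    return {hashval: count / total for hashval, count in abundances.items()}
```

After normalization the values of one signature sum to 1, so signatures built from samples of different sizes are on the same scale.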
-
## Enhancement
Currently, after we import data to the cluster, we need to analyze the table, which is time-consuming since it needs to scan the whole table. Collecting table statistics can be done …
-
It would be good to make a few changes for Python 3.
`range` instead of `xrange` and `print` as a function.
In lines 49 to 56:
```
for _ in range(d):
table = array.array("…
-
The vast majority of khmer's functionality depends on two data structures (*sketches?*): *Bloom filter* and *Count-min sketch*. And *REALLY* these are pretty much the same data structure at the c…
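
One way to picture the similarity, as a sketch only: a set of hashed tables whose cells saturate at `max_count`. With `max_count=1` it behaves like a Bloom filter (presence/absence); with a larger cap it behaves like a Count-min sketch. This is not khmer's implementation; the class `HashTables`, its parameters, and the choice of BLAKE2b as the hash are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator, List, Tuple


class HashTables:
    """`depth` tables of `width` saturating counters (illustrative only)."""

    def __init__(self, width: int, depth: int, max_count: int) -> None:
        self.width = width
        self.depth = depth
        self.max_count = max_count  # 1 -> Bloom-filter-like, >1 -> Count-min-like
        self.tables: List[List[int]] = [[0] * width for _ in range(depth)]

    def _slots(self, item: str) -> Iterator[Tuple[int, int]]:
        # One independent hash per table, derived by salting with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item: str) -> None:
        # Increment the item's cell in every table, saturating at max_count.
        for row, col in self._slots(item):
            if self.tables[row][col] < self.max_count:
                self.tables[row][col] += 1

    def get(self, item: str) -> int:
        # The minimum across tables is the least collision-inflated estimate.
        return min(self.tables[row][col] for row, col in self._slots(item))
```

For example, `HashTables(1024, 4, max_count=1)` answers "have I seen this k-mer?" while `HashTables(1024, 4, max_count=255)` answers "about how many times?"; the insert and lookup code is identical in both cases.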
-
It appears that the term "caret", which describes the `^` symbol, is not currently in the build123d docs.
From an example I stumbled across, it appears that the caret operator moves a plane/flat o…