-
When demultiplexed with a read structure that indicates the presence of UMIs (i.e. one that includes `M`; e.g. `146T8B9M8B146T` for a 9 bp UMI), the resulting BAM files include per-read UMI sequences via the [RX tag](https:…
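
To make the read-structure notation concrete, here is a minimal sketch that splits a string such as `146T8B9M8B146T` into its segments and sums the UMI bases. It assumes the Picard/fgbio-style convention (`T` template, `B` sample barcode, `M` molecular barcode, `S` skip) and ignores variable-length `+` segments. The function names `parse_read_structure` and `umi_length` are hypothetical and not part of any tool.

```python
import re

# Assumed operator meanings (Picard/fgbio-style read structures).
SEGMENT_KINDS = {
    "T": "template",
    "B": "sample_barcode",
    "M": "molecular_barcode",
    "S": "skip",
}


def parse_read_structure(read_structure):
    """Split e.g. '146T8B9M8B146T' into a list of (length, kind) pairs."""
    segments = re.findall(r"(\d+)([TBMS])", read_structure)
    # Reject input that is empty or contains anything besides <length><operator> pairs.
    rebuilt = "".join(length + op for length, op in segments)
    if not segments or rebuilt != read_structure:
        raise ValueError(f"unsupported read structure: {read_structure!r}")
    return [(int(length), SEGMENT_KINDS[op]) for length, op in segments]


def umi_length(read_structure):
    """Total number of bases assigned to molecular barcodes (UMIs)."""
    return sum(
        length
        for length, kind in parse_read_structure(read_structure)
        if kind == "molecular_barcode"
    )


if __name__ == "__main__":
    print(parse_read_structure("146T8B9M8B146T"))
    print(umi_length("146T8B9M8B146T"))
```

For the example above, the single `9M` segment is the only molecular-barcode segment, so the sketch reports a 9-base UMI.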
-
**Thanos version used**:
- Thanos: `quay.io/thanos/thanos:v0.34.1` (deployed in K8s)
**What happened**:
We are encountering an issue with duplicate stores being detected in our Thanos Query set…
-
In my [generals talk](https://www.youtube.com/watch?v=M8i2HKEnoqI), I proposed deduplicating "instructions" from a hardware design using a set of rewrites:
The rewrites looked like, e.g.:
```
(…
-
Cover some common approaches to backing up filesystems to S3. See also #49.
It would be good to mention backup options/tools.
- https://github.com/zbackup/zbackup (deduplicating backups, inspired by rsync, …
-
### Proposal
With the introduction of RW2.0, we've duplicated a lot of code to handle things like reducing allocations, because the existing code path depended on/assumed the previous proto format.
…
-
I suggest comparing your solution with other deduplication solutions, such as borg, casync, desync, and rdedup. Compare not only the size of the deduplicated and compressed data, but also the speed of creating dedupl…
-
Our IndexOf{Any} (and some LastIndexOf{Any}) implementations all have a scalar path that's used when vectorization can't be, either because the current platform doesn't support it, the target type doe…
-
### Description
1. `from employees | stats cd1=count_distinct(salary, 3000), cd2=count_distinct(salary, 3000 + 1000 - 1000), cd3=count_distinct(salary, 1000)`
fails with
```
"type": "illega…
-
### Describe the bug
Same bug as https://github.com/dbt-labs/dbt-utils/issues/713, but for Spark.
When there is a null column (unrelated to the partition by or order by columns), Spark doesn't retu…
-
# Overview
See [record linkage design doc](https://docs.google.com/document/d/1GPANDNf01Lc_oU8FSk3RUkByZta-bgb3dEUuvQrHRjk/edit?tab=t.0) for diagram and more notes.
We want to conduct record lin…