-
In the [marker API of my `fxprof-processed-profile` crate](https://docs.rs/fxprof-processed-profile/latest/fxprof_processed_profile/trait.ProfilerMarker.html), I'm currently asking users to list `"typ…
-
### Description
I'm working with large amounts of data (sometimes more than 100 GB) which contain timestreams. Within the timestreams, there are interesting events I would like to look at. I want …
-
Reducing heap usage while building completed segments. Currently, the segment builder is designed to read incoming data row by row, and build dictionaries in a hash table before translating them to th…
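
For illustration only, the row-by-row dictionary stage described above might look roughly like the sketch below. The function name `build_dictionary` and the in-memory lists are hypothetical and not taken from the actual segment builder. The later translation step is left out because the description is cut off before it says what the dictionaries are translated to.

```python
def build_dictionary(values):
    """Dictionary-encode one column while reading it one row at a time.

    Hypothetical sketch: the hash table grows with the number of distinct
    values, which is where the heap usage comes from.
    """
    ids_by_value = {}  # hash table: value -> dictionary id
    dictionary = []    # dictionary id -> value, in first-seen order
    encoded = []       # one dictionary id per row
    for value in values:
        dict_id = ids_by_value.get(value)
        if dict_id is None:
            dict_id = len(dictionary)
            ids_by_value[value] = dict_id
            dictionary.append(value)
        encoded.append(dict_id)
    return dictionary, encoded
```

Both the hash table and the per-row id list stay on the heap until the column is finished, so memory grows with cardinality and row count.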
-
Storing data in Hive-style partitions is a very common use case when writing data in columnar formats to object stores.
It would be great if the library added support for the following features wrt to…
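
For readers unfamiliar with the layout, a Hive-style partitioned path encodes each partition column as a `key=value` directory segment. The sketch below shows only that naming convention. The helper `hive_partition_path`, the bucket name, and the file name are hypothetical, and the percent-encoding is an approximation of Hive's escaping rules rather than an exact reimplementation.

```python
from urllib.parse import quote


def hive_partition_path(base, partitions, filename):
    """Build a Hive-style path such as base/year=2024/month=1/file.parquet.

    `partitions` is an ordered mapping of partition column -> value.
    Values are percent-encoded so that characters like '/' cannot break
    the directory structure (an approximation of Hive's escaping).
    """
    segments = [f"{key}={quote(str(value), safe='')}" for key, value in partitions.items()]
    return "/".join([base.rstrip("/"), *segments, filename])


path = hive_partition_path(
    "s3://example-bucket/events",  # hypothetical location
    {"year": 2024, "month": 1},
    "part-0.parquet",
)
```

Because the partition values are in the path, a reader can skip whole directories when a query filters on `year` or `month`.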
-
Currently, the implementation includes value hashes in the key space so that:
1. We can avoid conflicting updates.
2. We can avoid having to read before we write.
The tradeoff is that a 32-byte over…
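
A minimal sketch of the idea, assuming the 32 bytes are a SHA-256 digest of the value appended to the logical key. The hash function, the `make_key` and `put` helpers, and the plain dict standing in for the store are all assumptions made for illustration, not the actual implementation.

```python
import hashlib

store = {}  # stand-in for the real key-value store (hypothetical)


def make_key(logical_key: bytes, value: bytes) -> bytes:
    """Append the 32-byte SHA-256 digest of the value to the logical key."""
    return logical_key + hashlib.sha256(value).digest()


def put(logical_key: bytes, value: bytes) -> bytes:
    """Blind write: no read is needed, because two different values can
    never land on the same physical key."""
    key = make_key(logical_key, value)
    store[key] = value
    return key
```

Writing two different values under the same logical key produces two distinct physical keys, so neither write overwrites the other, and rewriting an identical value is idempotent. The cost is the extra 32 bytes on every key.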
-
# Data Format Changes
## Introduction:
The current data format used in our application has some limitations, such as a lack of clarity and the need for expensive calculated columns to be generated…
-
I have JSON data where the columnar (line-delimited) part is in a `data` subkey:
```json
{
"metadata": {"name": "block1"},
"data" : [
{"a": 1, "b": 2.0, "c": "foo", "d": false},
{"a…
-
We want to be able to store JSON log events in Pinot so that they can be queried efficiently and so that we can reduce storage costs. Part of this involves encoding unstructured message fields in the …
-
**Use case**
When storing metrics data in ClickHouse, one of the biggest problems encountered is cardinality. This is a very common problem related to storing metrics data in any columnar store. In …
-
🧠 rewind
✔️ parquet file[🔗](https://github.com/seoyeong200/LeetCode/issues/15#issuecomment-2412094885)
\- with spark read performance
🀄️ b-tree, b+tree, isolation level [🔗](https://github.com/s…