-
Similar to the `stats --stats-jsonl` option, create a `frequency --frequency-jsonl` option.
The `frequency cache` will have the COMPLETE frequency table for each column, not just the top N (default…
-
I follow the logs of a Kubernetes pod. Those logs are in `jsonl` format (https://jsonlines.org/), where each line is a valid and complete JSON entry.
I see `jsonl` support is added here: https://gi…
aliok updated
1 month ago
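For illustration, here is a minimal Python sketch of reading such a `jsonl` log stream, where each line is parsed as a standalone JSON value. The function name `parse_jsonl` and the sample log lines are assumptions made up for this example; they are not taken from the issue or from any tool it refers to.

```python
import json
from typing import Any, Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line.

    Lines that are not valid JSON are skipped, since log streams
    sometimes interleave plain-text output with JSON entries.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines carry no entry
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON entry; ignore it in this sketch


# Hypothetical sample input, standing in for lines read from a pod log.
sample = [
    '{"level": "info", "msg": "started"}',
    "plain text line",
    "",
    '{"level": "error", "msg": "failed"}',
]
records = list(parse_jsonl(sample))
```

Skipping unparseable lines is a design choice for log following; a stricter reader would raise on the first malformed line instead.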
-
`id2metric` has length 4531, while `test_data.jsonl` has 4564 entries. The two lengths do not match, so some of the generated examples cannot be scored.
-
https://github.com/instructlab/instructlab/blob/a21129abc8badf4b3a699273e503fa9e9745e664/README.md?plain=1#L699
While walking through the steps on my M3 Mac and attempting to run `ilab model test`, this happ…
-
I hope this message finds you well. My name is HOJIN LEE, and I am currently an undergraduate researcher at a university in South Korea. I am working with the CrysTens repository, specifically trying …
-
Dear Ruggero,
it is much more convenient, both for debugging and for making charts, maps, etc., to have everything in long format, in just a few fields.
The inspiration came to me from seeing a lot of field names with spaces ("Partit…
-
**Describe the bug**
When fine-tuning a Llama 3.1 Instruct LLM, the memory consumption increases from 90 GB to over 200 GB after approximately 60 iterations. Because of this my M3 Max MacBook with 128GB mem…
-
```
2024-09-01 15:00:25.122 | INFO | datatrove.executor.local:run:120 - Skipping 4095 already completed tasks
2024-09-01 15:00:25.772 | INFO | datatrove.utils.logging:add_task_logger:58 - La…
-
I am running against v0.1.0.
I have a type definition in the json schema called `ResourceDescriptor`. It should require at least one of `content`, `digest`, `uri` fields to be set. However, what I…
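As a rough illustration of the "at least one of `content`, `digest`, `uri`" requirement, the sketch below shows one common way to express it in JSON Schema (an `anyOf` over `required` clauses) together with a small hand-written check. The schema fragment is an assumption, since the actual `ResourceDescriptor` definition is not shown in the issue, and `satisfies_any_of_required` is a hypothetical helper that evaluates only this fragment, not a general validator.

```python
from typing import Any, Mapping

# Assumed schema fragment; the real ResourceDescriptor definition may differ.
RESOURCE_DESCRIPTOR_SCHEMA = {
    "type": "object",
    "anyOf": [
        {"required": ["content"]},
        {"required": ["digest"]},
        {"required": ["uri"]},
    ],
}


def satisfies_any_of_required(
    instance: Mapping[str, Any],
    schema: Mapping[str, Any] = RESOURCE_DESCRIPTOR_SCHEMA,
) -> bool:
    """Return True if at least one `anyOf` branch has all its required keys present."""
    return any(
        all(key in instance for key in branch.get("required", []))
        for branch in schema["anyOf"]
    )


ok = satisfies_any_of_required({"uri": "https://example.com/artifact"})
missing_all = satisfies_any_of_required({"name": "artifact"})
```

Here `ok` is `True` and `missing_all` is `False`, which is the behaviour the requirement describes.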
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
`def create_piazza_index(json_file_path, index_folder, levels_back=None, collapse_length…