-
### CKAN Version if known (or site URL)
https://demo.ckan.org
### Please describe the expected behaviour
Resource downloads are tracked.
### Please describe the actual behaviour
Resource down…
-
I've noticed that a `CRITICAL ERROR` message can be thrown in a number of different cases, but this doesn't always seem to result in a non-zero exit code for sanoid. By "silently" failing with 0, it's…
-
Thanks for sharing. Does the "--data_filename" option mean that you used real datasets to train the model? If so, could you please share the datasets with us?
-
Hi. My goal is to finetune a large BERT-based MT model (e.g. `NLLB-200-1.3B`) on new words that are outside the model's vocabulary.
I managed to finetune it only from a technical point of view, without p…
-
## 1980
Analysis was based on breast and ovarian cancer incidence in female first-degree relatives of the index patient.
* All individuals were censored at age 80 because incidence rates may be…
-
It appears that the EML specification defines a namespace for the root element of a document, but proceeds to use the empty namespace for every other element. This is very weird, and makes the use of …
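
To illustrate the mixed-namespace situation described above, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The document string, the namespace URI, and the `dataset`/`title` element names are assumptions chosen for illustration, not taken from a real EML file. The point is only that a namespaced root with unqualified children forces lookups to mix qualified and unqualified names.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and element names, for illustration only.
EML_NS = "https://eml.ecoinformatics.org/eml-2.2.0"

doc = (
    f'<eml:eml xmlns:eml="{EML_NS}">'
    "<dataset><title>Example</title></dataset>"
    "</eml:eml>"
)

root = ET.fromstring(doc)
ns = {"eml": EML_NS}

# The root element lives in the EML namespace...
root_tag = root.tag

# ...but its children are in no namespace, so an unqualified lookup works
unqualified = root.find("dataset")

# while a namespace-qualified lookup for the same child finds nothing.
qualified = root.find("eml:dataset", ns)

print(root_tag)
print(unqualified is not None, qualified is None)
```

Code that assumes one namespace for the whole document would miss every child element here, which is one way this design can complicate generic XML tooling.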
-
One of tobac's advantages is that it keeps each step in the tracking process separate. Right now, each step using tobac's various tracking functions produces a separate dataset or array.
However, a…
-
## CKAN version
2.10
## Describe the bug
HTML headings are inconsistent in several places, mostly where snippets are (re-)used at different hierarchy levels.
This constitutes an accessibility …
-
https://github.com/huggingface/cosmopedia/blob/main/deduplication/deduplicate_dataset.py
```
2024-02-22 14:17:57.759 | INFO | datatrove.executor.slurm:launch_job:216 - Launching dependency job…
-
Hi all, thank you for releasing this great work with all the details!
I have an issue downloading the ultrachat 64k dataset used during ProLong SFT (https://huggingface.co/datasets/princeton-nlp/prolong…