-
Hi and thanks for your contributions. Could you upload the script generate_dysfluency.py or the simulated datasets?
-
Hi, thanks so much for this great work!
I may be missing something, but in the training pipeline I could not find any code for `generate_dataset`, `generate_GPT_MistralRS`, or the `temp` dataset.
…
-
Hello, I would like to know which dataset you used to train the SDM.
During training, was it conditioned generation or unconditional generation?
In the original SDM code, both the training and sam…
-
**Describe the bug**
I am experiencing an issue when trying to use a `RoutingBatchFunction` inside a pipeline. Specifically, I am using `sample_n_steps()`, as shown in the example here: https://distilab…
-
Hi @mmaaz60, thanks for sharing this great work!
I was wondering whether you have a plan to share the code for semi-automated dataset generation (the pipeline of using Katna to extract keyframes ->…
-
First of all, thanks for your contribution. I am using your model for dataset rebalancing, and I am running into a problem with datasets that have few, imbalanced samples. Generation fails due to reaching the…
-
As a follow-up to #163, we need to figure out the right way to wire a precomputed dataset into the skills data generation. One example of such a dataset is https://github.com/instructlab/training/blob/…
-
Sometimes, DS tasks need training data taken from a web page or generated by an LLM.
data-interpreter should decide whether to get useful training data from a web page
or to generate useful data by itself.
I mean i…
-
**Deliverable this task is associated with**
Processed datasets for lipidomics datasets for nmdc:sty-11-dcqce727
**RACI**
_Tag people in their roles_
- Responsible: @kheal
- Accountable: @l…
-
**Deliverable this task is associated with**
Processed datasets and associated metadata submitted to mongo for lipidomics datasets for nmdc:sty-11-aygzgv51
**RACI**
_Tag people in their roles…