-
### Question
ShareGPT is used for instruction fine-tuning, with the aim of inserting image-independent, text-only conversations into multi-turn image conversations, so that the model…
-
**Describe the bug**
Generating larger datasets with `LoadDataFromDicts` leads to underutilization of the GPU during the `TextGeneration` step.
**To Reproduce**
Setting `N_SAMPLES` to a smal…
-
### Describe the bug
When I use `load_dataset` to load mozilla-foundation/common_voice_7_0, it successfully downloads and extracts the dataset, but it cannot generate the Arrow file,…
-
> Traditional OCR datasets can be transformed into instruction-following datasets. For example, in the traditional OCR dataset, a data sample is an image with OCR ground truths.
>
> W…
-
I am planning to fine-tune the VideoChat2 model with custom instruction data to enhance its performance on downstream tasks. I have a couple of questions regarding the pre-training data and the proces…
-
Updated 2024-07-01.
Datasets:
- Used for evaluation:
  - MMLU: https://huggingface.co/datasets/hails/mmlu_no_train
  - ARC-Challenge: https://huggingface.co/datasets/allenai/ai2_arc
  - HellaSwag: h…
-
Hello, I wanted to express my gratitude for your work; it's been instrumental in my current project. However, I've encountered some confusion that I'm hoping you can shed light on.
Now, I want to trai…
-
Add instructions or tools for parsing the data from the datasets into the graphs and visual representations shown in the video, e.g., the sentiment analysis.
-
Great work - do you have any details on the exact datasets used and where to get them?
-
Thanks for your excellent work. I am curious if there are any instructions for fine-tuning video-llava on my own dataset?