-
Hi, thank you so much for all your work. However, when I try to reproduce the example "2. Then we train a GraphSAGE on top of the generated embeddings:", I get the following error.
Traceback (most recent call…
-
**Description**
Implement a Random Forest model to predict sales using the cleaned sales dataset.
**Tasks**
Data Preparation
**Load and preprocess the sales dataset.**
Handle missing values, …
-
I used augmentoolkit to generate a ShareGPT-format dataset and am trying to use "Llama-3 8b Instruct Unsloth 2x faster finetuning.ipynb". I am not good at coding, so I don't know what to change in
`f…
-
When I check your configuration file, I see that `dataset.split` is "trainval". So, are the metrics reported in your paper based on this config file?
-
Sorry, but I have a question: which lines of code or which script did you use to split the data into train, val, and test sets? I can't find it, so I thought I should ask you. Looking forward to hearing from you soo…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
### System Information
Linux x86-64
Python 3.10.5
`sentence_transformers` 3.0.1
`transformers` 4.41.2
`datasets` 2.19.2
### Reproduction
Running on GPU:
```py
from datasets import load_data…
-
### Describe the bug
When I run the code `en = load_dataset("allenai/c4", "en", streaming=True)`, I encounter an error: raise ValueError(f"Couldn't infer the same data file format for all splits. Got {…
-
**Scenario**: Fine-tuning BGE-M3. The .jsonl data file contains 158,000 records; each record has one query, a pos list of length 1, and a neg list of length 15.
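
For reference, the record shape described above can be checked with a short script. This is only a minimal sketch: the key names `query`, `pos`, and `neg` are taken from the description, while the helper `check_record` and its default lengths are hypothetical and not part of any library. It can rule out malformed records in the data file; it is not a diagnosis of the error.

```python
import json


def check_record(line, n_pos=1, n_neg=15):
    """Return a list of problems found in one JSONL line (empty if none).

    Hypothetical helper: the expected lengths default to the values
    given in the scenario (1 positive, 15 negatives per query).
    """
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(rec, dict):
        return ["record is not a JSON object"]

    problems = []
    # The query must be present and be a non-empty string.
    if not isinstance(rec.get("query"), str) or not rec["query"].strip():
        problems.append("query must be a non-empty string")
    # pos and neg must be lists of strings with the expected lengths.
    for key, expected in (("pos", n_pos), ("neg", n_neg)):
        value = rec.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
        elif len(value) != expected:
            problems.append(f"{key} has {len(value)} items, expected {expected}")
    return problems


sample = json.dumps({"query": "q", "pos": ["p"], "neg": ["n"] * 15})
print(check_record(sample))  # []
```

Running `check_record` over every line of the file and printing the line numbers with a non-empty result would show whether any record deviates from the described shape.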
**Error**:
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS envi…
-
Hi,
Thanks for providing the pre-training database with foldseek tokens! I'm having difficulty downloading the dataset and using it with Hugging Face functions. Trying
```
from datasets import load_da…