hiyouga / ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
Apache License 2.0
3.65k stars 471 forks

Dataset doesn't exist. #18

michaeloo0 closed 1 year ago

michaeloo0 commented 1 year ago
FileNotFoundError: Couldn't find a dataset script at 
/content/JosephusCheung/GuanacoDataset/GuanacoDataset.py or any data file in the
same directory. Couldn't find 'JosephusCheung/GuanacoDataset' on the Hugging 
Face Hub either: FileNotFoundError: Dataset 'JosephusCheung/GuanacoDataset' 
doesn't exist on the Hub. If the repo is private or gated, make sure to log in 
with `huggingface-cli login`.
hiyouga commented 1 year ago

We reproduced this problem by running the following script:

CUDA_VISIBLE_DEVICES=0 python src/finetune.py \
    --do_train \
    --dataset guanaco \
    --finetuning_type lora \
    --output_dir test \
    --overwrite_cache \
    --overwrite_output_dir \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --max_train_samples 10000 \
    --learning_rate 1e-4 \
    --num_train_epochs 1.0 \
    --fp16

We could not reproduce the error: the script ran without problems on our side. Please check your internet connection, or log in with `huggingface-cli login`.

hiyouga commented 1 year ago

We strongly recommend logging in with your Hugging Face account, since this dataset requires confirmation on the Hub before it can be used.
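
For anyone hitting the same error, here is a rough sketch of how you might check for a saved login before trying to load the dataset. It is not part of this repo: `find_hf_token` and `load_guanaco` are hypothetical helper names, the token file location (`~/.cache/huggingface/token`, or `$HF_HOME/token`) is assumed to be the default one written by `huggingface-cli login`, and the `token=` argument assumes a recent release of `datasets` (older releases used `use_auth_token=`). Whether the dataset loads this way also depends on you having accepted its terms on the Hub.

```python
import os
from pathlib import Path


def find_hf_token(env=None, hf_home=None):
    """Return a saved Hugging Face token, or None if no login is found."""
    env = os.environ if env is None else env
    # 1) An explicit HF_TOKEN environment variable takes priority.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    # 2) Otherwise look for the token file written by `huggingface-cli login`
    #    (assumed location: $HF_HOME/token, default ~/.cache/huggingface/token).
    if hf_home is None:
        hf_home = env.get("HF_HOME") or Path.home() / ".cache" / "huggingface"
    token_file = Path(hf_home) / "token"
    if token_file.is_file():
        return token_file.read_text().strip() or None
    return None


def load_guanaco(name="JosephusCheung/GuanacoDataset"):
    """Hypothetical helper: fail early with a clear message if not logged in."""
    token = find_hf_token()
    if token is None:
        raise RuntimeError(
            "No Hugging Face token found; run `huggingface-cli login` first."
        )
    # Imported lazily so the token check works without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(name, token=token)
```

Calling `find_hf_token()` with no arguments tells you whether a login is visible to Python; if it returns `None`, the FileNotFoundError above is the likely outcome for a dataset that needs confirmation.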