hendrycks / math

The MATH Dataset (NeurIPS 2021)
MIT License

ValueError: not enough values to unpack (expected 2, got 1) #11

Open lazir0lufias opened 2 years ago

lazir0lufias commented 2 years ago

Hi, sorry, I'm new to this field.

!python tune_gpt.py --khan-dataroot /content/amps/khan/ --save-dir /content/drive/MyDrive/model/

When I run the above command on Google Colab, I get this error:

Traceback (most recent call last):
  File "tune_gpt.py", line 333, in <module>
    main()
  File "tune_gpt.py", line 318, in main
    train_data = get_dataset(args)
  File "tune_gpt.py", line 239, in get_dataset
    len_multiplier, dirname = args.khan_dataroot.split("@")
ValueError: not enough values to unpack (expected 2, got 1)

How to fix this?
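
For readers hitting the same error: the traceback shows the script unpacking args.khan_dataroot.split("@") into two names, which suggests the argument is expected in the form len_multiplier@dirname. The sketch below only reproduces that one line in isolation. The helper name split_dataroot and the multiplier value "1" are hypothetical, and the thread does not say what multiplier the script actually expects.

```python
def split_dataroot(value):
    """Mimic the unpacking from the traceback (hypothetical helper)."""
    len_multiplier, dirname = value.split("@")
    return len_multiplier, dirname


# A plain path contains no "@", so split("@") returns a one-element list
# and the two-name unpacking fails with the error from the issue.
try:
    split_dataroot("/content/amps/khan/")
except ValueError as err:
    error_message = str(err)

# With an "@" separator the unpacking succeeds. The "1" here is only an
# illustrative value, not a documented default.
parts = split_dataroot("1@/content/amps/khan/")

print(error_message)
print(parts)
```

If this reading of the traceback is right, passing a bare directory to --khan-dataroot will always raise this error, regardless of how the Drive directories are laid out.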

hendrycks commented 2 years ago

On Colab, there must not have been a directory name, since directories on Drive may be structured unusually.

Best, Dan Hendrycks


lazir0lufias commented 2 years ago

The free Drive tier is only 15 GB, but AMPS is 23 GB. How can I edit the scripts so that the model can be trained on Colab?

gcalabria commented 1 year ago

I have the same problem here. I am running it on my machine, not on colab. Any ideas?

Update: I figured out what was causing this problem. You were probably passing the argument as a string. For example:

python t5_tune.py \
  --mathematica-dataroot "/home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt"

However, it should be passed as a path (i.e., without the quotes):

python t5_tune.py \
  --mathematica-dataroot /home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt

The problem now is that I am getting the error "zsh: argument list too long: python", which I believe occurs because there are simply too many files in the corpus.

ayaka14732 commented 1 year ago

I believe that dataroot is the path to the directories, i.e. /home/gui/dev/t5math/data/amps/mathematica, not a list of files.

ayaka14732 commented 1 year ago

I understand the problem now. It should be

python t5_tune.py \
  --mathematica-dataroot="/home/gui/dev/t5math/data/amps/mathematica/*/*/*.txt"

in your case.
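
A note on why the quoted form avoids "argument list too long": with quotes, the shell passes the pattern through as a single argument instead of expanding it into one argument per file. This only works if the script expands the pattern itself. The thread does not show how t5_tune.py does this, so the sketch below assumes something like glob.glob and uses a small temporary directory tree as a stand-in for the corpus. The helper name expand_pattern is hypothetical.

```python
import glob
import os
import tempfile


def expand_pattern(pattern):
    """Expand a glob pattern inside Python instead of in the shell."""
    return sorted(glob.glob(pattern))


# Build a tiny stand-in tree: <root>/<topic>/<subtopic>/1.txt
root = tempfile.mkdtemp()
for sub in (os.path.join("algebra", "a"), os.path.join("calculus", "b")):
    os.makedirs(os.path.join(root, sub))
    with open(os.path.join(root, sub, "1.txt"), "w") as fh:
        fh.write("example\n")

# The program receives ONE string, however many files it matches, so the
# operating system's limit on command-line length never comes into play.
pattern = os.path.join(root, "*", "*", "*.txt")
files = expand_pattern(pattern)

print(len(files))
```

Unquoted, the same pattern is expanded by zsh before python starts, and every matching file becomes its own command-line argument.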