Closed: ashercn97 closed this issue 1 year ago
Note: I was using example/falcon/config-7b-lora.yml instead of the llama-3b one!
Please provide more details (errors etc.). Try running the default open llama one.
@NanoCode012 Okay. I will run the openllama one and then copy and paste the error. Will try right now.
@NanoCode012 Can it be with a CPU runtime, or should I only do GPU?
@NanoCode012 I got this error using the Llama one (I was using a CPU runtime):
Traceback (most recent call last):
File "/home/studio-lab-user/.conda/envs/python39/bin/accelerate", line 8, in
@NanoCode012 Now I got it to work with the default, but when I try to use my own config file, which is the same as the llama one except that I change the dataset, I get this error:
component = fn(*varargs, **kwargs)
File "/home/studio-lab-user/axolotl/scripts/finetune.py", line 226, in train
train_dataset, eval_dataset = load_prepare_datasets(
File "/home/studio-lab-user/axolotl/src/axolotl/utils/data.py", line 393, in load_prepare_datasets
dataset = load_tokenized_prepared_datasets(
File "/home/studio-lab-user/axolotl/src/axolotl/utils/data.py", line 268, in load_tokenized_prepared_datasets
samples = samples + list(d)
File "/home/studio-lab-user/axolotl/src/axolotl/datasets.py", line 42, in iter
yield self.prompt_tokenizer.tokenize_prompt(example)
File "/home/studio-lab-user/axolotl/src/axolotl/prompt_tokenizers.py", line 116, in tokenize_prompt
tokenized_res_prompt = self._tokenize(
File "/home/studio-lab-user/axolotl/src/axolotl/prompt_tokenizers.py", line 64, in _tokenize
result = self.tokenizer(
File "/home/studio-lab-user/.conda/envs/python39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2571, in call
raise ValueError("You need to specify either text
or text_target
.")
ValueError: You need to specify either text
or text_target
.
died with <Signals.SIGKILL: 9>.
With a CPU runtime, that should mean a RAM OOM.
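For reference, the ValueError above usually means a dataset row ended up with an empty or missing field, so the tokenizer was handed nothing to encode (the same "missing values" issue mentioned in the tips further down). A minimal sketch for locating such rows, assuming an alpaca-style train.jsonl; the file name and required field names are illustrative, not taken from this issue:

```python
import json

# Fields assumed to be required for an alpaca-style row ("input" may be blank).
REQUIRED = ("instruction", "output")

with open("train.jsonl", encoding="utf-8") as f:
    for lineno, line in enumerate(f, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        missing = [key for key in REQUIRED if not row.get(key)]
        if missing:
            print(f"line {lineno}: empty or missing fields: {missing}")
```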
@NanoCode012 Thank you!
Figured out how to get it working :)
@ashercn97, could you please detail it for any future individual?
@NanoCode012 Yes ofc! I will write a step-by-step thing for how I did it.
SOME TIPS:
- Use the alpaca dataset format.
- It won't work if you have missing values, so get rid of those (see the sketch after this comment).
- To stop it, just press Ctrl+C in the terminal; it will save and you can start back up from there.
- Use a GPU runtime.
Hope this helped!
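A minimal sketch of the "get rid of missing values" step from the tips above, assuming the same alpaca-style JSONL layout (file and field names are again illustrative). It writes a copy of the dataset with incomplete rows dropped, which the config's dataset path can then point at:

```python
import json

REQUIRED = ("instruction", "output")  # illustrative alpaca-style field names

kept, dropped = [], 0
with open("train.jsonl", encoding="utf-8") as src:
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        if all(row.get(key) for key in REQUIRED):
            kept.append(row)
        else:
            dropped += 1

# Write the cleaned copy next to the original.
with open("train.clean.jsonl", "w", encoding="utf-8") as dst:
    for row in kept:
        dst.write(json.dumps(row, ensure_ascii=False) + "\n")

print(f"kept {len(kept)} rows, dropped {dropped}")
```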
I am trying to use axolotl in Amazon SageMaker Studio, but I cannot figure it out. I am using Python 3.9 and running the quickstart code in the README, but it doesn't work. I am super new to this, so if someone could help me that'd be great! (Sorry if this is a stupid question.)