Prerequisites
Backend
Local
Interface Used
CLI
CLI Command
nohup autotrain llm \
  --train \
  --model 'meta-llama/Meta-Llama-3-70B-Instruct' \
  --project-name 'New-Llama-3-70B-Instruct-002' \
  --data-path '/opt/dataset' \
  --train-split 'train' \
  --valid-split 'validation' \
  --epochs 8 \
  --lr 2e-4 \
  --text-column text \
  --peft \
  --eval-strategy epoch \
  --train-batch-size 1 \
  --mixed-precision fp16 \
  --quantization int4 \
  --trainer sft \
  --use-flash-attention-2 &
UI Screenshots & Parameters
No response
Error Logs
File "/opt/anaconda3/envs/train-env/lib/python3.11/site-packages/autotrain/trainers/clm/utils.py", line 474, in configure_logging_steps
    logging_steps = int(0.2 * len(valid_data) / config.batch_size)
                              ^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()
Additional Information
This error occurs because line 444 of src/autotrain/trainers/clm/utils.py sets the following:
def process_data_with_chat_template(config, tokenizer, train_data, valid_data):
    valid_data = None  # <-- this is the mistake
So when I don't use a chat template, the function discards the valid_data that was passed in instead of returning it, and configure_logging_steps then fails on len(valid_data).
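Here is a minimal sketch of the problem and one possible fix. It uses plain lists and a SimpleNamespace as stand-ins for the real datasets and config, so it is not the actual autotrain code: process_buggy, process_fixed, apply_template and logging_steps are hypothetical names I made up for illustration, and the chat-template branch is only an assumption about how the real function is structured.

```python
from types import SimpleNamespace


def apply_template(rows):
    # Hypothetical stand-in for the real chat-template formatting step.
    return [f"<chat>{row}</chat>" for row in rows]


def process_buggy(config, train_data, valid_data):
    # Mirrors the reported line: valid_data is discarded up front.
    valid_data = None
    if config.chat_template is not None:
        train_data = apply_template(train_data)
    return train_data, valid_data


def process_fixed(config, train_data, valid_data):
    # Possible fix: leave valid_data untouched unless a template is applied.
    if config.chat_template is not None:
        train_data = apply_template(train_data)
        if valid_data is not None:
            valid_data = apply_template(valid_data)
    return train_data, valid_data


def logging_steps(config, valid_data):
    # Same arithmetic as the failing line in the traceback.
    return int(0.2 * len(valid_data) / config.batch_size)


# No chat template, batch size 1, as in the CLI command above.
config = SimpleNamespace(chat_template=None, batch_size=1)
train = ["a"] * 100
valid = ["b"] * 50

_, valid_buggy = process_buggy(config, train, valid)
try:
    logging_steps(config, valid_buggy)
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)

_, valid_fixed = process_fixed(config, train, valid)
steps = logging_steps(config, valid_fixed)

print("buggy:", buggy_error)
print("fixed:", steps)
```

With the reset removed, the validation split survives when no chat template is set, and the logging-steps calculation gets a real dataset to measure. A guard in configure_logging_steps that falls back to train_data when valid_data is None might also work, but I have not tested either change against the real trainer.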