Hi @williamd4112, you can set "export TOKENIZERS_PARALLELISM=false" before running your script.
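If you would rather not touch the shell environment, the same effect can be had from inside the script itself. A minimal sketch, assuming the warning comes from the Hugging Face `tokenizers` backend and that the variable is set before any tokenizer does work:

```python
# Minimal sketch (assumption: the parallelism warning originates from the
# Hugging Face `tokenizers` backend). Setting the variable inside the script,
# before any tokenizer is created, is equivalent to the shell export above.
import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"

# ... the rest of the training script (tokenizer/model setup, training loop)
# then runs as usual.
```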
To reduce RAM usage, you can decrease the batch size and increase grad-accum-steps accordingly, for example:
python llm_rl_scripts/maze/ilql/train_ilql.py PARAMS bc_checkpoint_path PATH_TO_YOUR_DATA --outputs-path ilql_checkpoint_path --train-bsize 8 --grad-accum-steps 16
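With those example values the effective batch size is --train-bsize 8 × --grad-accum-steps 16 = 128 examples per optimizer update, while only 8 examples' activations need to be resident in memory at any one time; scaling the two flags in opposite directions keeps the effective batch size constant.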
Also, if you have multiple GPUs or TPUs, you could consider using data parallelism by setting --data-mesh-shape {num_devices}.
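For example, on a host with 4 accelerators (4 is just an illustrative count), the same command could be launched with the mesh flag added:

python llm_rl_scripts/maze/ilql/train_ilql.py PARAMS bc_checkpoint_path PATH_TO_YOUR_DATA --outputs-path ilql_checkpoint_path --train-bsize 8 --grad-accum-steps 16 --data-mesh-shape 4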
I can train and evaluate BC, but when I run the following command to fine-tune ILQL, I get what is possibly a memory error.
Error:
I suspect this is because I'm running out of RAM. Is there any hyperparameter setting that would reduce the RAM requirement?