arpaiva opened 5 months ago
I'm trying to reproduce the car-dealer results using the commands in `llm_rl_scripts/car_dealer/misc/test_car_dealer.sh`, but I'm unable to train BC with the code. There were syntax issues with the code in the main branch, which I believe I corrected in PR #18. Even after applying those changes, I had to remove `--model-p-shape=4` from the BC command, as that is an unrecognized argument; I'm not sure how important that is. Still, the code quickly ends with:

[error output omitted]

and it doesn't generate any output or checkpoint. Where is the BC-finetuned model being trained, then?
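For what it's worth, a quick way to confirm that nothing was written is to list the outputs path. This is a minimal sketch; the assumption that checkpoints would land under an `exp_name` subdirectory is mine, not something I've confirmed in the script:

```bash
# List anything the run might have written under the outputs path.
# The expected layout (an exp_name subdirectory holding checkpoints) is an assumption.
find "${WORKDIR}/car-dealer/outputs" -mindepth 1 -maxdepth 3 -print
```

If training had actually produced checkpoints, I'd expect this to show at least one non-empty directory.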
For additional context, the command and arguments used were:

```bash
python -m llm_rl_scripts.car_dealer.bc.train_bc \
    HF ${LM_MODELS_DIR}/gpt2-large \
    ${DATADIR}/car-dealer/simulator/model \
    --outputs-path=${WORKDIR}/car-dealer/outputs/ \
    --data-path=${DATADIR}/car-dealer/ \
    --exp_name car-dealer-bc \
    --epochs=18 \
    --train-bsize=16 \
    --grad-accum-steps=8 \
    --inference-bsize=32 \
    --num-logs-per-epoch=4 \
    --num-evals-per-epoch=4 \
    --save-best \
    --save-last
```

There are exactly three changes compared to what is given in `llm_rl_scripts/car_dealer/misc/test_car_dealer.sh`:

1. `gpt2-large` is used instead of `gpt2-xl`, due to memory limitations;
2. `--exp_name` is added, because the script requires it;
3. `--model-p-shape` is removed, as it is an unrecognized argument.

BTW, `${DATADIR}` refers to the local copy of the datasets.
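Also, for anyone hitting the same unrecognized-argument error: assuming the entry point uses a standard CLI parser that honors `--help` (which I haven't verified for this repo), the accepted flags can be listed directly:

```bash
# Assumption: the training entry point responds to --help with its recognized flags.
python -m llm_rl_scripts.car_dealer.bc.train_bc --help
```

That would show whether `--model-p-shape` (presumably the model-parallel partition shape) is expected under a different name or was dropped entirely.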