GuangtaoLyu opened this issue 7 months ago
Hello, can you share the hyper-parameters you used in training? I want to do a simple reproduction. Thank you.

Hi,
The default parameters are the ones we used for training, with one exception: the batch size. We use a batch size of 512 because we train on 4 GPUs, and we therefore divide `total-iter`, `lr-scheduler`, and `eval-iter` by 4, in proportion to the increased batch size. The learning rate stays the same. Here is the batch script we use:
```bash
name='HML3D_45_crsAtt1lyr_20breset'
vq_name='2023-07-19-04-17-17_12_VQVAE_20batchResetNRandom_8192_32'
dataset_name='t2m'  # assumed: not set in the original snippet; inferred from --dataname below

export CUDA_VISIBLE_DEVICES=0,1,2,3
MULTI_BATCH=4  # 4 GPUs: batch size is multiplied by 4, iteration counts are divided by 4

python3 train_t2m_trans.py \
    --exp-name ${name} \
    --batch-size $((128*MULTI_BATCH)) \
    --vq-name ${vq_name} \
    --out-dir output/${dataset_name} \
    --total-iter $((300000/MULTI_BATCH)) \
    --lr-scheduler $((150000/MULTI_BATCH)) \
    --dataname t2m \
    --eval-iter $((20000/MULTI_BATCH))
```
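For reference, with `MULTI_BATCH=4` the shell arithmetic above resolves to `--batch-size 512`, `--total-iter 75000`, `--lr-scheduler 37500`, and `--eval-iter 5000`; the learning rate itself is left at its default.

If you want to reproduce on a single GPU instead, a minimal sketch of the equivalent invocation with the unscaled defaults (assuming the same `name` and `vq_name` variables as above; only the batch size and the three iteration counts change) would be:

```bash
export CUDA_VISIBLE_DEVICES=0

# Unscaled single-GPU defaults: batch 128, 300k iterations,
# LR-scheduler milestone at 150k, evaluation every 20k iterations.
python3 train_t2m_trans.py \
    --exp-name ${name} \
    --batch-size 128 \
    --vq-name ${vq_name} \
    --out-dir output/t2m \
    --total-iter 300000 \
    --lr-scheduler 150000 \
    --dataname t2m \
    --eval-iter 20000
```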