GuangtaoLyu opened this issue 7 months ago
Hello, can you share the hyper-parameters you used in training? I want to do a simple reproduction. Thank you.

Hi,
The default parameters are the ones we used for training, with one exception: the batch size. We use a batch size of 512 because we train on 4 GPUs, and we therefore divide `total-iter`, `lr-scheduler`, and `eval-iter` by 4, in proportion to the increased batch size. The learning rate stays the same. Here is the batch script we use:
```bash
name='HML3D_45_crsAtt1lyr_20breset'
vq_name='2023-07-19-04-17-17_12_VQVAE_20batchResetNRandom_8192_32'
dataset_name='t2m'  # assumed: not set in the original snippet; inferred from --dataname below

export CUDA_VISIBLE_DEVICES=0,1,2,3
MULTI_BATCH=4  # 4 GPUs: batch size is multiplied by 4, iteration counts are divided by 4

python3 train_t2m_trans.py \
    --exp-name ${name} \
    --batch-size $((128*MULTI_BATCH)) \
    --vq-name ${vq_name} \
    --out-dir output/${dataset_name} \
    --total-iter $((300000/MULTI_BATCH)) \
    --lr-scheduler $((150000/MULTI_BATCH)) \
    --dataname t2m \
    --eval-iter $((20000/MULTI_BATCH))
```
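For reference, with `MULTI_BATCH=4` the shell arithmetic above resolves to `--batch-size 512`, `--total-iter 75000`, `--lr-scheduler 37500`, and `--eval-iter 5000`; the learning rate itself is left at its default.

If you want to reproduce on a single GPU instead, a minimal sketch of the equivalent invocation with the unscaled defaults (assuming the same `name` and `vq_name` variables as above; only the batch size and the three iteration counts change) would be:

```bash
export CUDA_VISIBLE_DEVICES=0

# Unscaled single-GPU defaults: batch 128, 300k iterations,
# LR-scheduler milestone at 150k, evaluation every 20k iterations.
python3 train_t2m_trans.py \
    --exp-name ${name} \
    --batch-size 128 \
    --vq-name ${vq_name} \
    --out-dir output/t2m \
    --total-iter 300000 \
    --lr-scheduler 150000 \
    --dataname t2m \
    --eval-iter 20000
```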