Open StrongTanisha opened 10 months ago
Failed: the XL and S runs were saving to the same checkpoint file, so each overwrote the other's state.
experiment_name = "timm-efficientnetv2_s"
gpu_type = "24GB VRAM GPU"
nnodes = 11
venv_path = "/mnt/Client/Strongzpnpupxvdfdllpjvckewupy3re/becstrlaxex7elmnesblpq7jurqemkbu/.venv/bin/activate"
output_path = "/mnt/Client/Strongzpnpupxvdfdllpjvckewupy3re/becstrlaxex7elmnesblpq7jurqemkbu/output_timm"
command = "train_cycling.py /mnt/.node1/Open-Datasets/imagenet/ILSVRC/Data/CLS-LOC --model=efficientnetv2_s --weight-decay=1e-5 --decay-rate=0.03 --decay-epochs=2.4 --bn-momentum=0.99 --epochs=350 --lr=0.256 --batch-size=62 --amp --resume $OUTPUT_PATH/small_checkpoint.pt"
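One possible way to avoid the checkpoint collision noted above is to derive the checkpoint file name from the experiment name, so that two model sizes sharing an output directory can never write to the same file. The sketch below is only an illustration: the helper `checkpoint_path` and the second experiment name `timm-efficientnetv2_xl` are hypothetical and are not part of `train_cycling.py` or this config.

```python
from pathlib import Path


def checkpoint_path(output_path: str, experiment_name: str) -> Path:
    """Build a checkpoint path that is unique per experiment name.

    Hypothetical helper for illustration; not part of train_cycling.py.
    """
    # Replace path separators so the experiment name is safe as a file name.
    safe_name = experiment_name.replace("/", "_").replace("\\", "_")
    return Path(output_path) / f"{safe_name}_checkpoint.pt"


# "timm-efficientnetv2_xl" is an assumed name for the XL run.
small = checkpoint_path("output_timm", "timm-efficientnetv2_s")
xl = checkpoint_path("output_timm", "timm-efficientnetv2_xl")

# Different experiments resolve to different files, even in a shared directory.
assert small != xl
```

The resulting path could then be passed to `--resume` in place of a hard-coded file name, assuming the script writes its checkpoint to the same location it resumes from.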
Source / repo
[URL]
Model description
[DESCRIPTION]
Dataset
[DATASET]
Literature benchmark source
https://arxiv.org/abs/2104.00298
Literature benchmark performance
[DESCRIPTION] [VALUE/S] https://github.com/huggingface/pytorch-image-models/blob/main/results/results-imagenet.csv#:~:text=tf_efficientnetv2_s.in1k
Strong Compute result achieved
[VALUE/S]
Basic training config (as applicable)
Nodes: 12
Epochs: 350
Effective batch size: [N]
Learning rate: [L]
Optimizer: [OPT]
Logs gist
[URL]