HazyResearch / safari

Convolutions for Sequence Modeling
Apache License 2.0

How to reproduce Hyena-ImageNet experiment results similar to the paper? #46

Open Diagreen opened 4 months ago

Diagreen commented 4 months ago

Hello all, I am trying to reproduce the ImageNet task from your great work! After installing everything by following the procedure in the repository, I ran the command "python -m train wandb=null experiment=imagenet/hyena-vit" from experiments.md.

My system has four A6000 GPUs, so I changed the number of devices from 8 to 4 in hyena-vit.yaml. After the last epoch ended, I got the message below.

Epoch 304: 100%|█| 2893/2893 [20:58<00:00, 2.30it/s, loss=4.29, val/accuracy=0.685, val/accuracy@5=0.885, val/accuracy@10=0.927, val/loss=1.510, train/accuracy=0.452, train/accuracy@5=0.667, train/accuracy@10=0.7

As I understand it, val/accuracy should be close to the 79.8 reported in the paper. Did something go wrong during training? The attachment is my training config file. If I made a mistake somewhere, please let me know.

Alternatively, would it be possible to share the vit_hyena checkpoint file?

Thank you.

Attachment: config_tree.txt

exnx commented 4 months ago

Hello! The first thing to change is: num_inner_mlps = 2
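
Another thing that may be worth checking, given the change from 8 to 4 devices described above: if the per-device batch size stays fixed, halving the device count also halves the global batch size unless gradient accumulation (or a larger per-device batch) compensates. The sketch below only illustrates that arithmetic. The per-device batch size of 128 and the helper names are hypothetical and are not taken from hyena-vit.yaml, and whether the repository's config already adjusts for this automatically is not shown in this thread.

```python
import math


def global_batch_size(per_device_batch, num_devices, grad_accum_steps=1):
    """Number of samples contributing to one optimizer step."""
    return per_device_batch * num_devices * grad_accum_steps


def grad_accum_to_match(target_global_batch, per_device_batch, num_devices):
    """Smallest accumulation count that reaches at least the target global batch."""
    return math.ceil(target_global_batch / (per_device_batch * num_devices))


# Hypothetical value for illustration only; read the real one from your config.
per_device = 128

original = global_batch_size(per_device, num_devices=8)
reduced = global_batch_size(per_device, num_devices=4)
accum = grad_accum_to_match(original, per_device, num_devices=4)
restored = global_batch_size(per_device, num_devices=4, grad_accum_steps=accum)

print(f"8 devices: {original}, 4 devices: {reduced}")
print(f"accumulation steps needed on 4 devices: {accum} (global batch {restored})")
```

If the global batch size does differ from the original setup, the learning rate schedule tuned for the 8-device run may no longer be a good fit, which could contribute to an accuracy gap.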