Hi,
When training the depthformer model, a batch size > 1 leads to several errors.
I am getting the impression that the code is not meant to handle any other batch size (except for data-parallel training with multiple GPUs).
Or am I missing something?
Thanks for your work and best regards,
Jonas