Open collinmccarthy opened 1 month ago
Hi Collin,
So bad news/good news on this one. Unfortunately, the only pretrained E-RADIO we have right now is the e-radio_v2 one. However, we're putting together a training pass right now that will result in some new models. The first will be a ViT-B/16, most likely followed by a ViT-L/16, ViT-H/16, and a new E-RADIO. There was some demand from other people for an xxxtiny as well, so we can try to include it. What's your timeline for your work?
Hi Mike,
That sounds great, thank you. If a pre-trained xxxtiny model were available anytime in June, or even early to mid July, that would be a huge help for me personally. For now I can use the existing e-radio model. I'm assuming the new E-RADIO model will come with an update to the source code / model definition so I can track down the changes that were made, but if not, I would really appreciate having that as well.
Hi, my query is kinda related to this so I'll just put it here: are there any particular issues with training E-RADIO using bfloat16? I want to fine-tune E-RADIO for downstream tasks using bfloat16 and I'm trying to figure out if this is feasible or not.
Hello, you should be good with E-RADIO and bfloat16. I did notice some issues with float16 (NaN) but none with bfloat16 and its wider range.
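To make the "wider range" point concrete, here is a minimal sketch using only the Python standard library. It is not taken from the RADIO code base: the helper names `to_float16` and `to_bfloat16` are hypothetical, and the bfloat16 conversion is emulated by truncating a float32 to its top 16 bits (real hardware typically rounds to nearest instead). It only illustrates why a value that overflows float16 can still be represented in bfloat16.

```python
import struct


def to_float16(x: float) -> float:
    """Round-trip x through IEEE half precision (hypothetical helper).

    Values beyond the float16 range (max 65504) are reported as +/-inf,
    which is the kind of overflow that tends to turn into NaN later on.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of a float32.

    Assumption: simple truncation, not round-to-nearest. bfloat16 keeps
    the 8 exponent bits of float32, so its range matches float32, at the
    cost of only 7 explicit mantissa bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000  # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 70000.0, 1e30):
        print(value, to_float16(value), to_bfloat16(value))
```

The trade-off is visible in the output: float16 overflows to inf for anything above 65504, while bfloat16 stays finite but keeps fewer significant digits.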
Hello,
In radio/eradio_model.py I see quite a few model definitions for the hybrid/e-radio variants:
fastervit2_large_fullres (ws7)
fastervit2_large_fullres_ws8
fastervit2_large_fullres_ws16
fastervit2_large_fullres_ws32
eradio_xxxtiny (ws16)
eradio_xxxtiny_8x_ws12
eradio_xxxtiny_8x_ws16
And then I see the eradio model is a wrapper around fastervit2_large_fullres_ws16, which matches what I see in radio/common.py.
Are there any other pre-trained e-radio models available right now or is it just this one "e-radio_v2"? I'm particularly interested in the (best performing) eradio_xxxtiny variant (probably one of the ws16 versions for a fair comparison). I'm assuming the older "eradio_v1" version is the same architecture as "e-radio_v2", but please correct me if I'm wrong.
Thanks!