stefan-it opened this issue 1 year ago (status: Open)
Hey,
Here's a rough list; let's hope I don't forget any important point:

- Models: google/electra-base-generator and google/electra-base-discriminator
- Learning rate: 8e-5

We have a publication about it, but sadly it is in Hungarian. If you need to know anything else, feel free to ask.
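Pulling these settings together, here's a minimal sketch of a pretraining configuration in Python. Only the model names and learning rate come from this thread (and, from the follow-up below, the two GPUs); the batch size and warmup steps are hypothetical placeholders, not values the authors confirmed:

```python
# Sketch of a pretraining config using the settings mentioned in the thread.
# Values marked "assumption" are placeholders for illustration only.
pretrain_config = {
    "generator": "google/electra-base-generator",          # from the thread
    "discriminator": "google/electra-base-discriminator",  # from the thread
    "learning_rate": 8e-5,                                 # from the thread
    "num_gpus": 2,                  # 2 x NVIDIA RTX 2080 Ti, per the thread
    "per_device_batch_size": 32,    # assumption, not confirmed
    "warmup_steps": 10_000,         # assumption, not confirmed
}

def effective_batch_size(cfg: dict) -> int:
    """Total batch size across all GPUs (assuming no gradient accumulation)."""
    return cfg["num_gpus"] * cfg["per_device_batch_size"]

print(effective_batch_size(pretrain_config))  # 64 under these placeholder values
```

With the placeholder per-device batch size of 32, the effective batch size across both GPUs would be 64; the real value would need to come from the authors' training script.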
We probably trained them on 2 x NVIDIA RTX 2080 Ti; I'm not entirely sure, but I'm going to check.
Yep, we used 2 x NVIDIA RTX 2080 Ti at that time
Hi @ficstamas ,
many thanks for open sourcing this very interesting implementation!
I would like to train my own models with this implementation (as additional models to my ByT5 project on historic texts), so I wanted to ask if you could share the hyperparameters that were used for pretraining this Hungarian model :thinking:
I would also be interested in the number of GPUs used for pretraining and in the total pretraining time for this model.
Many thanks in advance!