w5688414 closed this issue 4 years ago
We did not include the adam_* params in the released model, to save disk space. Please check https://github.com/google-research/electra/issues/45 for more details.
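For anyone hitting the same error, here is a minimal sketch of how one might check which variables a checkpoint is missing before restoring. It assumes the optimizer slots are named by appending `/adam_m` and `/adam_v` to the parameter name; the helper names and the example variable names are hypothetical, not taken from the ELECTRA code.

```python
# Assumption: Adam slot variables are named "<param_name>/adam_m" and
# "<param_name>/adam_v". All names below are illustrative only.
ADAM_SUFFIXES = ("/adam_m", "/adam_v")


def is_adam_slot(name):
    """Return True if the variable name looks like an Adam slot."""
    return name.endswith(ADAM_SUFFIXES)


def split_restorable(graph_names, checkpoint_names):
    """Split graph variable names into those present in the checkpoint
    and those missing from it, preserving the original order."""
    available = set(checkpoint_names)
    restorable = [n for n in graph_names if n in available]
    missing = [n for n in graph_names if n not in available]
    return restorable, missing


# Hypothetical variables the pretraining graph expects.
graph_names = [
    "encoder/layer_0/kernel",
    "encoder/layer_0/kernel/adam_m",
    "encoder/layer_0/kernel/adam_v",
]
# Hypothetical contents of a released checkpoint without optimizer state.
checkpoint_names = ["encoder/layer_0/kernel"]

restorable, missing = split_restorable(graph_names, checkpoint_names)
only_optimizer_state_missing = all(is_adam_slot(n) for n in missing)
```

With TensorFlow installed, the checkpoint names could be read with `tf.train.list_variables(checkpoint_path)`, which returns `(name, shape)` pairs. If only Adam slots are missing, restoring just the `restorable` names and letting the optimizer initialize its own state is one possible workaround.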
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing since there have been no updates.
Error loading checkpoints for pretraining: adam_m is missing?