ChunyuanLI / Optimus

Optimus: the first large-scale pre-trained VAE language model

Seems like checkpoints for {beta=0, beta=0.5} latent size=32 are the same checkpoints #27

Open yiminzme opened 2 years ago

yiminzme commented 2 years ago

For the following two checkpoints listed in optimus_finetune_language_models.md:

beta=0, latent size = 32 https://chunylcus.blob.core.windows.net/machines/msrdl/optimus/output/pretrain/philly_rr3_vc4_g8_base_vae_wikipedia_pretraining_beta_schedule_beta0.0_d1.0_ro0.5_ra0.25_32_v2/checkpoint-508523.zip

beta=0.5, latent size = 32 https://chunylcus.blob.core.windows.net/machines/msrdl/optimus/output/pretrain/philly_rr3_vc4_g8_base_vae_wikipedia_pretraining_beta_schedule_beta0.5_d1.0_ro0.5_ra0.25_32_v2/checkpoint-508523.zip

The sums of all their parameters are identical, so I suspect these are actually the same checkpoint. Could anyone please double-check this?
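A minimal sketch of one way to double-check this, assuming each archive has been unzipped and contains a standard PyTorch state dict (the file name `pytorch_model.bin` and the directory names below are assumptions, not confirmed paths inside the zips):

```python
import torch

# Hypothetical local paths after unzipping the two checkpoint archives;
# the actual file layout inside the zips may differ.
ckpt_a = torch.load("beta0.0_32/checkpoint-508523/pytorch_model.bin", map_location="cpu")
ckpt_b = torch.load("beta0.5_32/checkpoint-508523/pytorch_model.bin", map_location="cpu")

# A parameter sum can collide by coincidence, so compare tensors element-wise.
same_keys = ckpt_a.keys() == ckpt_b.keys()
same_values = same_keys and all(torch.equal(ckpt_a[k], ckpt_b[k]) for k in ckpt_a)
print("identical keys:", same_keys, "| identical values:", same_values)
```

If every tensor matches element-wise, the two downloads really are the same checkpoint rather than merely having equal parameter sums.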

Btw, thanks for publishing your work on GitHub.

Enchantedovo commented 5 months ago

Hello. I wonder if you have downloaded the processed wiki dataset for training? If you can share an available link, I would appreciate it very much! Looking forward to your response.