Open sanky29 opened 2 months ago
Hi Sanket,
Thank you for the query. The performance is sensitive to the training seed, so it is advisable to train the model with 4 different seed values (the more, the better). In your reported results, the standard deviation seems too low, and my hunch is that not enough seeds were run for training.
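For reference, here is a minimal sketch of how per-seed scores could be aggregated into the mean and standard deviation discussed in this thread. It is not the repository's evaluation code: the function name `summarize_runs`, the metric keys, and the scores are all hypothetical placeholders. It uses the sample standard deviation (`statistics.stdev`), which is an assumption; a population standard deviation would give smaller values, especially with only a few seeds.

```python
import statistics


def summarize_runs(scores_per_seed):
    """Return {metric: (mean, sample_std)} for a dict of seed -> {metric: value}.

    Hypothetical helper for illustration only; not part of the repository.
    """
    # Collect every metric name that appears in any run.
    metrics = sorted({m for scores in scores_per_seed.values() for m in scores})
    summary = {}
    for metric in metrics:
        values = [scores[metric] for scores in scores_per_seed.values()]
        mean = statistics.mean(values)
        # stdev needs at least two values; report 0.0 for a single run.
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[metric] = (mean, std)
    return summary


# Placeholder scores for four seeds (illustrative values, not real results).
example = {
    0: {"disentanglement": 0.70, "completeness": 0.50},
    1: {"disentanglement": 0.80, "completeness": 0.60},
    2: {"disentanglement": 0.75, "completeness": 0.55},
    3: {"disentanglement": 0.75, "completeness": 0.55},
}

summary = summarize_runs(example)
for metric, (mean, std) in summary.items():
    print(f"{metric}: mean {mean:.4f}, std {std:.4f}")
```

With only two runs, a standard deviation computed this way rests on a single degree of freedom, so a very small value says little about the true seed-to-seed variance.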
Hope it helps.
Hi Gautam,
Thank you for your timely response and the helpful suggestion. We will proceed with training the models using different random seeds, as you recommended.
Hi Gautam,
After training the models using different random seeds, we obtained the following results:
DCI
Disentanglement: mean 0.7673, std 0.0634
Completeness: mean 0.5569, std 0.1137
Informativeness: mean 0.97134, std 0.0045
While investigating further, we noticed two potential issues:
After correcting these two aspects, we reevaluated the models and obtained the following updated results:
DCI
Disentanglement: mean 0.8204, std 0.0291
Completeness: mean 0.57096, std 0.1079
Informativeness: mean 0.9646, std 0.0038
We would appreciate any further feedback or thoughts you might have on these observations.
Hello,
After training the model for 200K steps, as mentioned in the paper, across two different runs, we were unable to reproduce the numbers reported in the paper for the Clevr-Easy dataset. We are attaching the numbers obtained from both runs for your reference.
DCI
Disentanglement: mean 0.503, std 0.0017
Completeness: mean 0.4482, std 0.0098
Informativeness: mean 0.9687, std 0.0015