carbonscott
/
maxie
Masked Autoencoder for X-ray Image Encoding (MAXIE)
Training code optimization
#18
Open
carbonscott
opened
3 weeks ago
carbonscott
commented
3 weeks ago
[x] Correct the iteration/step count
[x] Enable activation checkpointing.
[x] Use TF32 (Ampere arch)
[x] Enable cudnn?
[x] Visualize input and output once in a while.
[x] Get gradient accumulation right!!
[x] Does the model overfit a really small dataset (a single batch)?
[ ] Init parameters with std = 1 / sqrt(fan_in); does the std grow due to residual blocks?
[ ] Are CUDA-related constants divisible by a power of 2? (CUDA is designed this way, lol.)
[x] Report grad norm
[ ] Improve the loading mechanism for when a batch size of one (a single detector image) is still too big.
[ ] Implement GAN loss
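
On the grad accum and grad norm items above: a minimal sanity check, sketched in plain Python rather than the actual MAXIE training code. It assumes a mean-reduced loss and equal-sized micro-batches; the toy linear model and the names `mse_grad`, `accumulated_grad`, and `grad_norm` are hypothetical. The point it illustrates is that each micro-batch gradient needs to be scaled by 1 / accum_steps to match the full-batch gradient, and that the reported grad norm is inflated otherwise.

```python
import math

def mse_grad(w, b, xs, ys):
    """Gradient of mean((w*x + b - y)^2) w.r.t. (w, b) over one batch."""
    n = len(xs)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb

def accumulated_grad(w, b, xs, ys, micro_batch_size, scale=True):
    """Sum micro-batch gradients, optionally scaling each by 1 / accum_steps."""
    # Assumption: the batch splits evenly into micro-batches.
    assert len(xs) % micro_batch_size == 0
    accum_steps = len(xs) // micro_batch_size
    factor = 1.0 / accum_steps if scale else 1.0
    gw_total, gb_total = 0.0, 0.0
    for i in range(0, len(xs), micro_batch_size):
        gw, gb = mse_grad(w, b,
                          xs[i:i + micro_batch_size],
                          ys[i:i + micro_batch_size])
        gw_total += gw * factor
        gb_total += gb * factor
    return gw_total, gb_total

def grad_norm(grads):
    """Global L2 norm over all gradient components."""
    return math.sqrt(sum(g * g for g in grads))

# Toy data: 8 samples, accumulated as 4 micro-batches of 2.
xs = [0.5, -1.0, 2.0, 3.5, -0.25, 1.5, 4.0, -2.0]
ys = [1.0, 0.0, 3.0, 5.0, 0.5, 2.0, 6.0, -1.5]
w, b = 0.1, -0.2

full = mse_grad(w, b, xs, ys)
scaled = accumulated_grad(w, b, xs, ys, micro_batch_size=2, scale=True)
unscaled = accumulated_grad(w, b, xs, ys, micro_batch_size=2, scale=False)

print("full-batch grad norm:       ", grad_norm(full))
print("accumulated (scaled) norm:  ", grad_norm(scaled))
print("accumulated (unscaled) norm:", grad_norm(unscaled))
```

The same comparison can be run against the real model: compute gradients once on a full batch and once with accumulation, then check that they agree within floating-point tolerance. It also ties into the step count item, since only one optimizer step happens per accum_steps micro-batches.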