Open johndpope opened 5 months ago
@fenghe12 / @JaLnYn / @ChenyangWang95
this might actually work.
In MegaPortraits I use a custom ResNet-50. It's probably safer to switch that in, because otherwise the model might just discard the updates? I'll check in the morning.
@johndpope is it just me, or is Sonnet 3.5's machine learning code output actually way more readable than Opus's? Feels like actual working code this time!
Something is maybe not quite right. I trained overnight and this is still epoch 0.
I changed the code back to use 512x512, resumed training, and got this.
I'm seeing newer, clearer images advancing in epoch 1, even after a few more cycles. Will update here later. I think by epoch 4 it's probably going to be fairly decent.
I added some TensorBoard logging to surface the losses.
UPDATE - my bad, I was overfitting to one image. I just pushed an updated dataloader. New debug image.
Starting training again. I was seeing OOM errors, so check your num_of_workers.
UPDATE - I restarted training and changed the generator to use resblocks. Maybe that will help recreate the image better.
UPDATE - Sunday. I rebuilt the code to do progressive training with resolution upscaling (64, 128, 256, 512) and added TensorBoard losses.
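For anyone following along, here is a minimal sketch of what a progressive resolution schedule like the one above could look like. It is not the code from the repo: `resolution_schedule` is a hypothetical helper, and the even split of epochs across stages is an assumption made only for illustration.

```python
def resolution_schedule(total_epochs, resolutions=(64, 128, 256, 512)):
    """Map each epoch index to a training resolution, lowest first.

    Epochs are split evenly across the stages; any leftover epochs
    are spent at the final (highest) resolution.
    """
    if total_epochs < len(resolutions):
        raise ValueError("need at least one epoch per resolution stage")
    per_stage = total_epochs // len(resolutions)
    schedule = []
    for epoch in range(total_epochs):
        # Clamp so leftover epochs stay on the last stage.
        stage = min(epoch // per_stage, len(resolutions) - 1)
        schedule.append(resolutions[stage])
    return schedule


if __name__ == "__main__":
    for epoch, res in enumerate(resolution_schedule(8)):
        # In a real loop, the dataloader's resize transform would be
        # rebuilt for `res` here before training the epoch.
        print(f"epoch {epoch}: train at {res}x{res}")
```

The idea is that the cheap low-resolution epochs settle the coarse structure first, and the expensive 512x512 epochs only need to refine it.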
I gave up training across CelebA. I'm overfitting to one pair of images instead.
training progress so far
UPDATE - Sunday night
So I had some battles with gradient explosions.
I ended up having to add some gradient accumulation steps, which helped stabilize things.
Looks like the learning rate is getting things into a minimum.
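A framework-free sketch of the accumulation steps mentioned above, in case the idea is unclear. This is a toy 1-D linear model, not the training loop from the repo, and `grad_mse` and `accumulated_step` are hypothetical names. It shows that summing micro-batch gradients scaled by 1 / accum_steps gives the same update as one step on the combined batch.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w, micro_batches, lr):
    """One optimizer step built from several micro-batches.

    Each micro-batch gradient is scaled by 1 / accum_steps and summed,
    so the update matches a single step on the combined batch
    (assuming the micro-batches are all the same size).
    """
    accum_steps = len(micro_batches)
    grad = 0.0
    for batch in micro_batches:
        grad += grad_mse(w, batch) / accum_steps  # accumulate, no update yet
    return w - lr * grad  # single update after all micro-batches


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # toy data, y = 2x
    micro = [data[:2], data[2:]]
    w = 0.0
    for _ in range(50):
        w = accumulated_step(w, micro, lr=0.05)
    print(round(w, 4))
```

In PyTorch the same pattern is usually written as `(loss / accum_steps).backward()` on every micro-batch, with `optimizer.step()` and `optimizer.zero_grad()` called once every `accum_steps` iterations. The larger effective batch gives less noisy gradients without needing more GPU memory.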
UPDATE - I switched to 256 because ResNet-50 can't return rich features (2048, 7, 7) for images smaller than 224x224.
I had to rework the generator to use fewer layers and 64x64 image resizing.
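To make the 2048,7,7 point above concrete, here is a small sketch that works out the spatial size of ResNet-50's last conv stage for different input sizes. It assumes the standard ResNet-50 layout (as in torchvision) rather than the custom one in the repo, and `conv_out` and `resnet50_feature_size` are hypothetical helpers.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a conv/pool layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_feature_size(input_size):
    """Spatial size of the last conv stage for a square input.

    Assumes the standard ResNet-50 layout: 7x7 stride-2 stem conv,
    3x3 stride-2 max pool, then four stages where stages 2-4 each
    halve the resolution with a 3x3 stride-2 conv (padding 1).
    """
    size = conv_out(input_size, kernel=7, stride=2, padding=3)  # stem conv
    size = conv_out(size, kernel=3, stride=2, padding=1)        # max pool
    for _ in range(3):                                          # stages 2, 3, 4
        size = conv_out(size, kernel=3, stride=2, padding=1)
    return size


if __name__ == "__main__":
    for res in (64, 128, 224, 256, 512):
        s = resnet50_feature_size(res)
        print(f"{res}x{res} input -> 2048 x {s} x {s}")
```

The total stride is 32, so a 224x224 input gives a 7x7 map while a 64x64 input leaves only 2x2, which is very little spatial detail for the generator to work from.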