After running 300K on TPU, we didn't achieve satisfactory results.

AaronAnima commented 5 years ago

We ran 'launch_train_tpu_sagan.sh' on a V3 pod for 300k steps, but training seems to collapse from around 10k steps, and the generation quality is not good. I wonder whether you've managed to achieve, or at least come close to, BigGAN's performance? Here are some results:

[Images: samples after 100k, 180k, and 300k steps]

zsdonghao commented 5 years ago

I got the same problem. It seems this code is not useful.

AaronAnima commented 5 years ago

I didn't use the 'comet' module, and I kept 'launch_train_tpu_sagan.sh' almost the same (I only made a few small changes for my local environment; the important configs such as bs, lr, etc. remain the same).

davidhughhenrymack commented 5 years ago

Hi all,

The code didn’t get finished, hence you’re not seeing good results. Feel free to submit PRs to improve it :)

On October 15, 2019 at 8:35:13 PM, Mingdong Wu (notifications@github.com) wrote:

I didn't use the 'comet' module, and I kept 'launch_train_tpu_sagan.sh' almost the same (I only made a few small changes for my local environment; the important configs such as bs, lr, etc. remain the same).


gwern commented 4 years ago

We can also report failure: after swapping the initialization to tf.orthogonal_initializer(1.0) to make it more BigGAN-like, a 6h run on a TPU pod (~12k/s or >4.3m total) produced only blobs like the attached sample:

[Image: test96-96-64-301]
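
For readers unfamiliar with the initializer mentioned above, here is a minimal sketch of what orthogonal initialization with a gain of 1.0 means. It is not the code from this repository and not TensorFlow's implementation: as far as I know, TensorFlow builds the matrix from a QR decomposition of a Gaussian matrix, while this sketch uses plain Gram-Schmidt in pure Python so that it runs without dependencies. The names `orthogonal_matrix` and `gram` are hypothetical helpers introduced only for illustration.

```python
import random


def orthogonal_matrix(n, gain=1.0, seed=0):
    """Return an n x n matrix (list of rows) with orthonormal rows scaled by gain.

    Illustrative only: Gram-Schmidt on Gaussian vectors, not TensorFlow's routine.
    """
    rng = random.Random(seed)
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Modified Gram-Schmidt: remove components along rows accepted so far.
        for q in rows:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-8:  # skip a (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return [[gain * a for a in row] for row in rows]


def gram(w):
    """Compute W @ W^T for a list-of-rows matrix."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in w] for r1 in w]


if __name__ == "__main__":
    w = orthogonal_matrix(4, gain=1.0)
    for row in gram(w):
        # Rows of W @ W^T should be close to the rows of the identity matrix.
        print([round(x, 6) for x in row])
```

With gain 1.0, W times its transpose is approximately the identity, so the layer initially preserves the norm of its input. That property is the usual motivation for this choice of initializer.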

zsdonghao commented 4 years ago

This project just does not work.

gwern commented 4 years ago

Well, we know that now. Unfortunately, the README doesn't mention that at all (a minor omission, to be sure).

Octavian-ai / BigGAN-TPU-TensorFlow

After running 300K on TPU, we didn't achieve satisfactory results. #6