AlignmentResearch / go_attack


How to set up the adversarial training? #141

Open Dreamkeeper66666 opened 4 months ago

Dreamkeeper66666 commented 4 months ago

Hi, I just tried having the cyclic-adv-s545 model play against the latest 28b model, but it doesn't seem to work very well, so I would like to do some fine-tuning on my own. I saw some scripts under the kubernetes folder, but I don't really know how to run them locally. Are there any instructions for setting up the iterative adversarial training on a local machine? Also, are there any more recent models? Thanks! @AdamGleave @tomtseng @ed1d1a8d

tomtseng commented 1 month ago

Right, I don't expect cyclic-adv-s545m to work well against 28b, since KataGo has been adversarially trained against cyclic positions from cyclic-adv-s545m. That said, I would guess that if you spent several hundred GPU-days fine-tuning, you could get an attack against 28b.

I have now added more detailed instructions to the README and uploaded adversaries to Google Drive. I should note that these experiments are fairly large in scale (several hundred GPU-days), so if you don't have access to several GPUs you might find it hard to make much training progress.
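
To make the scale concrete, here is a rough back-of-the-envelope sketch of how GPU-days translate into wall-clock time. It assumes perfectly linear scaling across GPUs, which real training runs rarely achieve, and the 300 GPU-day figure is only a placeholder for "several hundred", not a measured number. The function name `wall_clock_days` is made up for this illustration and is not part of the repository.

```python
def wall_clock_days(gpu_days: float, num_gpus: int) -> float:
    """Idealized wall-clock days, assuming perfectly linear scaling across GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return gpu_days / num_gpus


# 300 GPU-days is a placeholder for "several hundred", not a measured figure.
ASSUMED_GPU_DAYS = 300

for n in (1, 4, 8):
    print(f"{n} GPU(s): ~{wall_clock_days(ASSUMED_GPU_DAYS, n):.1f} days")
```

Under these assumptions, a single GPU would need most of a year, while eight GPUs would still need over a month.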