AlignmentResearch / go_attack

MIT License

Can you reupload vit-victim-b16-s650m bin.gz file to the drive? #153

Open SodiumJu opened 3 months ago

SodiumJu commented 3 months ago

Hi, I have noticed that vit-victim-b16-s650m in your Google Drive is not a bin.gz file. I would like to use your ViT-adversary to attack our Transformer-based models. However, the ViT-adversary needs a victim model in bin.gz format, so I am wondering if you could upload your victim model in bin.gz format or give instructions for converting your pt file into the right bin.gz format. Thank you very much. @AdamGleave @tomtseng @ed1d1a8d

tomtseng commented 3 months ago

Hey, thanks for your interest—unfortunately we don't have vit-victim-b16-s650m in bin.gz format.

The context:

The .ckpt file in the Google Drive can be loaded in PyTorch with torch.load() if you just want to look at the weights.
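
For anyone who only wants to look at the weights, here is a minimal sketch of that inspection step. The layout of the checkpoint is not described in this thread, so the code assumes it is a (possibly nested) dictionary whose leaves are tensors; the helper names `load_checkpoint` and `summarize` and the file name in the usage comment are hypothetical.

```python
from collections.abc import Mapping


def load_checkpoint(path):
    """Load a checkpoint onto the CPU so no GPU is needed for inspection."""
    import torch  # imported lazily; requires PyTorch to be installed

    # Recent PyTorch releases default to weights_only=True. A trusted checkpoint
    # that stores extra Python objects may need weights_only=False instead.
    return torch.load(path, map_location="cpu")


def summarize(obj, prefix=""):
    """Return (name, shape) pairs for every tensor-like leaf in a nested mapping.

    Assumption: the checkpoint is a dict of dicts/tensors. Leaves without a
    .shape attribute (step counters, strings, ...) are skipped.
    """
    rows = []
    if isinstance(obj, Mapping):
        for key, value in obj.items():
            rows.extend(summarize(value, f"{prefix}{key}."))
    elif hasattr(obj, "shape"):
        rows.append((prefix.rstrip("."), tuple(obj.shape)))
    return rows


# Usage (hypothetical file name):
#   for name, shape in summarize(load_checkpoint("vit-victim-b16-s650m.ckpt")):
#       print(name, shape)
```

Only unpickle checkpoint files from a source you trust, since loading without `weights_only=True` can execute arbitrary code.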

If you can't get the .pt file working (which requires using our fork of KataGo), you could instead use a different victim model with the ViT-adversary and hope the adversary is still good at launching its attack. control-b10 in the Google Drive is a good candidate for this: it's a small CNN that was trained with the same settings as the ViT, and it's also vulnerable to the ViT-adversary. Figure F.8 in our preprint shows the ViT-adversary (with control-b10 as the victim model) beating control-b10 at high visits.

SodiumJu commented 3 months ago

Thank you very much for the detailed explanation. I’m interested in running TorchScript models in C++ as you mentioned. Could you let me know if there is any existing code related to this in the vit-experimental fork of KataGo? Thanks again for your help.

tomtseng commented 3 months ago

The stable branch on KataGo-custom should be able to run the TorchScript models if you're compiling KataGo with the CUDA backend. In the CUDA backend we added a hack: if the model is a ".pt" file, it is loaded as a TorchScript file. So in selfplay, match, or gtp you should be able to just set the model to a .pt file and it should load. You'll also need to set the KataGo configs useFP16=true, useNHWC=false, and inputsUseNHWC=false for the model.
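
The three settings above can be collected in one place, as in the rough sketch below. Only the setting names and values come from this thread. The helper names, the `./katago` binary path, and the use of the `-model`, `-config`, and `-override-config` flags (as in upstream KataGo's gtp command) are assumptions that have not been checked against the KataGo-custom fork.

```python
# Settings quoted in the comment above for running a TorchScript (.pt) model.
TORCHSCRIPT_SETTINGS = {
    "useFP16": "true",
    "useNHWC": "false",
    "inputsUseNHWC": "false",
}


def config_lines(settings=TORCHSCRIPT_SETTINGS):
    """Render the settings as `key = value` lines to paste into a KataGo .cfg file."""
    return [f"{key} = {value}" for key, value in settings.items()]


def gtp_command(model_path, config_path, binary="./katago"):
    """Build an argument list for a gtp run with the settings as overrides.

    Assumption: the fork keeps upstream KataGo's -model, -config and
    -override-config flags. The binary path is hypothetical.
    """
    if not model_path.endswith(".pt"):
        raise ValueError("the TorchScript loading path is only taken for .pt files")
    overrides = ",".join(f"{k}={v}" for k, v in TORCHSCRIPT_SETTINGS.items())
    return [
        binary, "gtp",
        "-model", model_path,
        "-config", config_path,
        "-override-config", overrides,
    ]
```

The returned list can be passed to `subprocess.run`. For selfplay and match, where models are usually named inside the config file, pasting the output of `config_lines()` into the config may be the simpler route.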