Hi @RorschachChen! Thank you for bringing the lack of code to our attention; we will post it shortly! I will update you when it is uploaded. The code is a bit tricky to get running, so I am happy to help you get started if you run into trouble.
@RorschachChen I have uploaded the code in the state it was run for the paper. This will not be easy to run yourself, because I did things in a manual or nonportable way, but I will work on improving the code so it works out of the box.
The main algorithm is in kernel_reverse.py, where the optimization of the poison data takes place.
In general the workflow would be (a rough driver sketch follows this list):
1. train_fast.py
2. kernel_reverse.py
3. train_fast.py
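For concreteness, here is a minimal Python driver for those three steps. The scripts' actual command-line options are omitted (check their argparse help), and the step descriptions in the comments are my reading of the workflow rather than wording from the repository.

```python
# Minimal sketch of the three-step workflow above.
# The real scripts take command-line options (dataset, model, paths, ...)
# that are omitted here; fill them in from each script's --help output.
import subprocess
import sys

def run(script, *args):
    """Run a repository script as a subprocess and fail loudly on error."""
    subprocess.run([sys.executable, script, *args], check=True)

# 1. Initial training run (presumably the clean model used by the attack).
run("train_fast.py")

# 2. Optimize the poison data against the kernel (the main algorithm).
run("kernel_reverse.py")

# 3. Retrain on the poisoned data to evaluate the attack.
run("train_fast.py")
```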
The challenges to a simple reproduction are:
We are also lacking good documentation for the time being.
Thanks for your reply, this is impressive work. I have a question about resources: I have 4 to 8 V100s with 32 GB of VRAM each. Will the code run on that, and roughly how long will it take?
Unfortunately I've never used V100s, so I don't know what kind of performance they have. One important factor is double-precision FLOPS: some GPUs (e.g. the A40) take a huge penalty (~64x) for double precision compared to single precision. I see that the V100's FP32:FP64 performance ratio is about 2:1, so at least it shouldn't be too bad.
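If you want to measure the double-precision penalty on your own cards, a quick micro-benchmark like the following (plain PyTorch, independent of this repository) compares FP32 and FP64 matmul throughput:

```python
# Rough FP32 vs FP64 matmul benchmark (independent of this repo).
import time
import torch

def bench(dtype, n=4096, iters=10):
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    elapsed = time.time() - start
    # One n x n matmul is roughly 2 * n^3 floating point operations.
    return 2 * n**3 * iters / elapsed / 1e12  # TFLOPS

print(f"FP32: {bench(torch.float32):.1f} TFLOPS")
print(f"FP64: {bench(torch.float64):.1f} TFLOPS")
```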
The other important thing is to have large memory so you can fit a bigger tile size, because computing each n * n tile effectively scales linearly in n, limited only by how many gradients you can fit in memory.
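To illustrate why tile size matters, here is a rough sketch (not the repository's actual code) of building a gradient Gram matrix tile by tile: the work per n * n tile grows roughly linearly in n (one per-example gradient for each row and column of the tile), while VRAM limits how many of those gradients you can hold at once.

```python
# Sketch only: blockwise computation of a gradient Gram matrix
# K[i, j] = <grad f(x_i), grad f(x_j)>, computed in tile x tile blocks.
# A bigger tile amortizes gradient computation over more kernel entries,
# but all `tile` per-example gradients of a block must fit in GPU memory.
import torch

def per_example_grads(model, xs, loss_fn):
    """Stack flattened per-example gradients into a (len(xs), n_params) matrix.
    Assumes every parameter receives a gradient; loss_fn maps outputs to a scalar."""
    rows = []
    for x in xs:
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0))).backward()
        rows.append(torch.cat([p.grad.reshape(-1) for p in model.parameters()]))
    return torch.stack(rows)

def gram_matrix(model, data, loss_fn, tile=256):
    n = len(data)
    K = torch.zeros(n, n)
    for i in range(0, n, tile):
        Gi = per_example_grads(model, data[i:i + tile], loss_fn)
        for j in range(0, n, tile):
            Gj = Gi if j == i else per_example_grads(model, data[j:j + tile], loss_fn)
            K[i:i + tile, j:j + tile] = (Gi @ Gj.T).cpu()
    return K
```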
Fortunately, it's easy to test what kind of performance you will get. For example, for the 5x WideResNet you can construct the kernel as shown here, run it on a small batch of dummy data, see what timings you get, and then extrapolate to however many GPUs you have and the size of your dataset.
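As a concrete version of that test, something like the sketch below would work; kernel_fn, the tile size, and the CIFAR-like input shape are all placeholders you would replace with the repository's actual kernel construction.

```python
# Hypothetical timing test: `kernel_fn` stands in for however you construct
# the kernel for the 5x WideResNet (see the linked code); the dummy data
# shape and dtype are assumptions.
import time
import torch

def estimate_total_hours(kernel_fn, n_total, tile=64, image_shape=(3, 32, 32)):
    x1 = torch.randn(tile, *image_shape, device="cuda", dtype=torch.float64)
    x2 = torch.randn(tile, *image_shape, device="cuda", dtype=torch.float64)
    kernel_fn(x1, x2)                        # warm-up / compilation
    torch.cuda.synchronize()
    start = time.time()
    kernel_fn(x1, x2)
    torch.cuda.synchronize()
    per_tile = time.time() - start
    n_tiles = (n_total / tile) ** 2          # blocks in the full n x n kernel
    return per_tile * n_tiles / 3600         # hours on a single GPU

# e.g. estimate_total_hours(my_kernel_fn, n_total=50_000, tile=64),
# then divide by the number of GPUs you can shard the tiles across.
```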
Thanks.