SewoongLab / ntk-backdoor


Hello, just kindly asking when the code could be released? #1

Closed RorschachChen closed 12 months ago

RorschachChen commented 1 year ago

Thanks.

jhayase commented 1 year ago

Hi @RorschachChen! Thank you for bringing the lack of code to our attention; we will post it shortly! I will update you when it is uploaded. The code is a bit tricky to get running, so I am happy to help you get started if you run into trouble.

jhayase commented 1 year ago

@RorschachChen I have uploaded the code in the state it was run for the paper. This will not be easy to run yourself, because I did things in a manual or nonportable way, but I will work on improving the code so it works out of the box.

The main algorithm is in kernel_reverse.py, where the optimization of the poison data takes place.

In general the workflow would be:

  1. Train a model on clean data using train_fast.py
  2. Compute the empirical NTK matrices. (Not uploaded yet)
  3. Run the attack in kernel_reverse.py
  4. Train a model on the clean data + poison data again using train_fast.py
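To make step 2's output concrete: once the empirical NTK matrices exist, they are typically consumed via kernel regression. A minimal numpy sketch of that usage, with random stand-in gradients instead of real per-example network gradients (all shapes here are hypothetical, not the paper's):

```python
import numpy as np

# Stand-in per-example gradients; the real ones come from the trained network.
rng = np.random.default_rng(0)
n_train, n_test = 8, 3
G = rng.standard_normal((n_train + n_test, 5))
K = G @ G.T                      # empirical NTK: K[i, j] = <grad_i, grad_j>

K_tt = K[:n_train, :n_train]     # train-train block
K_st = K[n_train:, :n_train]     # test-train block
y = rng.standard_normal(n_train) # stand-in train labels

reg = 1e-3                       # small ridge term for numerical stability
preds = K_st @ np.linalg.solve(K_tt + reg * np.eye(n_train), y)
print(preds.shape)               # (3,)
```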

The challenges to a simple reproduction are:

  1. The computation of the kernel matrices requires substantial resources. I originally wrote a script to distribute the computation on the UW cluster, but I don't think that script will be of much use to other people because it's not portable. It turns out it's possible to compute the matrices very quickly on a single node with A100s because they have very fast double-precision performance and a very large amount of VRAM which means they can run at larger batch sizes, but I don't have any script to do this.
  2. My general workflow was to just edit the files to run different experiments (bad idea I know...), so one needs to read the source and figure out what configuration to apply to make everything consistent. I tried to make this easier but it is not as good as a config-based system.

We are also lacking good documentation for the time being.

RorschachChen commented 1 year ago

Thanks for your reply; this is impressive work. I have a question about resources: I have about 4~8 V100s with 32 GB of VRAM each. Is that enough to run the code, and approximately how long would it take?

jhayase commented 12 months ago

Unfortunately I've never used V100s, so I don't know what kind of performance they have. One important factor is double-precision FLOPS: some GPUs (e.g. the A40) have a huge penalty (~64x) for double precision relative to single precision. I see that the V100's FP32:FP64 performance ratio is about 2:1, so at least it shouldn't be too bad.

The other important thing is to have a large amount of memory so you can fit a bigger tile size: computing each n × n kernel effectively scales linearly in n, limited only by how many gradients you can fit in memory.
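The tiling idea above can be sketched in a few lines of numpy: the kernel K = J Jᵀ is assembled block by block, so only two tiles of gradients need to be resident at once (the tile size plays the role of the VRAM-limited batch size; `tiled_gram` is a hypothetical helper, not a function from the repo):

```python
import numpy as np

def tiled_gram(J, tile):
    """Compute K = J @ J.T in (tile x tile) blocks, keeping only two
    row-tiles of gradients in memory at a time."""
    n = J.shape[0]
    K = np.empty((n, n), dtype=J.dtype)
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            K[i:i+tile, j:j+tile] = J[i:i+tile] @ J[j:j+tile].T
    return K

rng = np.random.default_rng(0)
J = rng.standard_normal((10, 4))   # 10 per-example gradients of dimension 4
assert np.allclose(tiled_gram(J, tile=3), J @ J.T)
```

A larger `tile` means fewer passes over the gradients, which is why GPUs with more VRAM finish faster.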

Fortunately, it's easy to test what kind of performance you will get. For example, for the 5x WideResNet, you can construct the kernel as in the code, run it on a small batch of dummy data, see what timings you get, and then extrapolate to however many GPUs you have and the size of your dataset.
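The extrapolation step can be sketched like this. The matmul below is a stand-in for the real kernel evaluation (the actual numbers must come from timing the WideResNet kernel itself on dummy data), and the sizes `m`, `p`, `n` are illustrative assumptions:

```python
import time
import numpy as np

def time_tile(m, p, reps=3):
    """Best-of-reps wall time for one m x m kernel tile built from
    m stand-in gradients of dimension p."""
    J = np.random.default_rng(0).standard_normal((m, p))
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        J @ J.T
        best = min(best, time.perf_counter() - t0)
    return best

m, p, n = 64, 1024, 60000       # tile size, gradient dim, dataset size (illustrative)
t_tile = time_tile(m, p)
n_tiles = (n / m) ** 2          # tiles needed to cover the full n x n kernel
est = t_tile * n_tiles
print(f"~{est:.1f} s for the full kernel on this machine (single device)")
```

Dividing `est` by the number of GPUs gives a rough parallel estimate, since the tiles are independent.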