YeonwooSung / ai_book

AI book for everyone

Radioactive data: tracing through training #5

Open YeonwooSung opened 4 years ago

YeonwooSung commented 4 years ago

Abstract

This paper introduces a method to mark a dataset with a hidden "radioactive" tag, such that any classifier trained on that data will detectably exhibit the tag.

Details

(Figure 2 of the paper: distribution plot; image not rendered.)

Personal Thoughts

Clearly, data is the modern gold. Neural classifiers improve their performance by training on more data, but given a trained classifier, it is difficult to tell what data it was trained on. This matters especially if you hold proprietary or personal data and want to make sure that other people do not use it to train their models. The radioactive marking proposed here addresses exactly that: the tag planted in the data survives training and can later be detected in the resulting classifier.
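The detection idea can be sketched in a toy simulation. This is not the paper's implementation: the feature dimension, the carrier direction `u`, and the way alignment is injected into the weights are all made up for illustration. The core statistical test, though, is as described: check whether the classifier's weights are unusually aligned with the secret carrier direction, compared with a null distribution of random directions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # hypothetical feature dimension

# Secret carrier: a random unit direction the data owner mixes into
# the marked images' features (the "radioactive" tag).
u = rng.normal(size=d)
u /= np.linalg.norm(u)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A classifier trained only on clean data: weights independent of u.
w_clean = rng.normal(size=d)

# A classifier trained on marked data tends to align with u; here we
# simulate that alignment directly by mixing u into random weights.
w_marked = rng.normal(size=d) + 5.0 * u

# Empirical null distribution: cosine of random directions with u.
null = np.array([cosine(rng.normal(size=d), u) for _ in range(10_000)])

def p_value(w):
    """Fraction of random directions at least as aligned with u as w."""
    return float((null >= cosine(w, u)).mean())

print(p_value(w_clean))   # unremarkable: no evidence of marked data
print(p_value(w_marked))  # near zero: weights aligned with the carrier
```

In the paper the null distribution has a closed form (the cosine between random high-dimensional vectors follows a known distribution), which gives an exact p-value instead of this empirical estimate.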

YeonwooSung commented 4 years ago

As I mentioned above, the main aim of this paper is to provide a method for checking whether other people are training their models on your data.

I assume this might be related to data privacy issues?