ganler / ResearchReading

General systems-research reading notes (not limited to papers).
GNU General Public License v3.0

MLSys'21 | sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data #48

Closed ganler closed 3 years ago

ganler commented 3 years ago

Short paper. A new approach for model specialization... https://rise.cs.berkeley.edu/wp-content/uploads/2020/01/sensAI_2_pager.pdf https://github.com/GuanhuaWang/sensAI

ganler commented 3 years ago

Background

Class-specific neuron analysis

e.g., decouple a 10-way NN to 10 binary classifiers.

So I think this only works for classification-style NN architectures.
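One plausible way to realize class-specific neuron analysis is to keep only the channels that activate strongly on inputs of the target class. A minimal sketch below; the threshold criterion and the per-sample mean-activation representation are my assumptions, not necessarily the paper's exact method:

```python
import numpy as np

def select_class_channels(activations, labels, target_class, threshold=0.1):
    """Pick channels whose mean activation on target-class inputs exceeds
    `threshold` -- one plausible class-specific selection criterion.

    activations: (num_samples, num_channels) per-sample mean feature-map
                 activations; labels: (num_samples,) class ids.
    Returns the indices of channels to keep for this class.
    """
    mask = labels == target_class
    class_mean = activations[mask].mean(axis=0)   # per-channel mean over class
    return np.nonzero(class_mean > threshold)[0]  # channels worth keeping
```

For example, with two samples of class 0 whose channel-1 activations stay near zero, channel 1 would be pruned from the class-0 binary classifier while channels 0 and 2 survive.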

Pruning

One-Vs-All (OVA) reduction

OVA model reduction is a general approach that reduces a multi-class learning problem into a set of simpler problems, each solvable with a binary classifier.

Unlike unit ablation, OVA trains multiple binary classifiers with pre-defined structures (it does not carve small binary models out of one big model).
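The OVA reduction itself is just a relabeling: an N-way labeling becomes N binary (target class vs. rest) labelings, one per classifier. A minimal sketch:

```python
import numpy as np

def ova_labels(labels, num_classes):
    """Reduce an N-way labeling to N one-vs-all binary labelings.

    Returns an (num_classes, num_samples) array: row k is 1 where the
    sample belongs to class k, else 0 -- the training targets for the
    k-th binary classifier.
    """
    labels = np.asarray(labels)
    return np.stack([(labels == k).astype(int) for k in range(num_classes)])
```

Each row of the result is then the label vector for one binary classifier, trained independently of the others.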

Big picture of sensAI


  1. Get N binary classifiers from an N-way big model via pruning (one-shot / iterative);
  2. Retrain the binary classifiers to regain accuracy;
  3. Combine the binary results back into an N-way prediction by adding a soft-max layer. (I think this step is not strictly necessary and might be problematic: the activation numeric ranges can differ across the per-class classifiers, so their outputs are not directly comparable.)
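Step 3 can be sketched as follows: stack one confidence score per binary classifier and soft-max over them. The function name and the assumption that each classifier emits a single scalar score are mine; the sketch also shows why uncalibrated score ranges are a worry:

```python
import numpy as np

def combine_ova_scores(scores):
    """Combine per-class binary scores (shape (num_classes,)) into an
    N-way prediction via soft-max.

    Caveat: if the binary heads have different numeric ranges, the
    soft-max is dominated by whichever head happens to output the
    largest raw score, regardless of true confidence.
    """
    z = scores - scores.max()            # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()  # soft-max over class scores
    return int(np.argmax(probs)), probs
```

So without per-head calibration (e.g., temperature scaling each binary classifier), the combined prediction can be biased toward classes whose classifier simply produces larger activations.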

Evaluation

Dataset: CIFAR-10. A single dataset is not enough, but it is fine as a preliminary result.

Looks good!