analogdevicesinc / ai8x-synthesis

Quantization and Synthesis (Device Specific Code Generation) for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

Early exit inference on MAX78000FTHR #336

Closed: shouborno closed this issue 3 months ago

shouborno commented 5 months ago

I noticed that it's possible to train some simple early exit models via ai8x-training, thanks to code such as the following: https://github.com/analogdevicesinc/ai8x-training/blob/develop/train.py#L1534

However, I'm unsure how to synthesize and run early exit on the device. Could you please share an example? I'd like to see which layer of the model a test inference exits from on-device.

Jake-Carter commented 5 months ago

Hi @shouborno,

Early exit is available via the --stop-after command-line flag to the ai8x-synthesis tool. See this table in the readme.

You can add --stop-after N to stop the inference after layer N. The synthesized output will update its known-answer test to also validate the model's intermediate result at that layer (see cnn_unload in the generated cnn.c file).

For example, to make MNIST exit early after layer 2:

python ai8xize.py --test-dir sdk/Examples/MAX78000/CNN --prefix mnist --stop-after 2 --checkpoint-file trained/ai85-mnist-qat8-q.pth.tar --config-file networks/mnist-chw-ai85.yaml --softmax --device MAX78000 --timer 0 --display-checkpoint --verbose

I hope this helps answer your question - let me know if I can help clarify further

shouborno commented 5 months ago

Is dynamic inference taken into account here? The training repository has the earlyexit_thresholds parameter, but I don't see it in the synthesis repository. Shouldn't it exit after a layer if the confidence score or some other metric exceeds a certain threshold? If this isn't implemented out of the box in ai8x-synthesis and only static early exit is supported, can you please give me some pointers about how I can implement threshold-based dynamic exits for MAX78000?

rotx-eva commented 5 months ago

There is no linkage between early exit during training (inherited in the training script from Intel's Nirvana Distiller) and the synthesis feature referenced by Jake. MAX78002 (not MAX78000) can terminate inference in hardware based on values/ranges of layer results, but there is currently no comprehensive example available.

shouborno commented 5 months ago

I see. Theoretically, is it possible for us to implement dynamic inference on MAX78000FTHR? If so, could you please give me some pointers? Also, what features of MAX78002 make this easier than on MAX78000?

rotx-eva commented 5 months ago

On MAX78000 you would have to treat "early exit" as two models. Run the first half to completion, examine the results in software, reload the register configuration, and run the second half of the model on the data that is already in the accelerator memory. You don't have to reload the weights: even though the second half of the model is treated as a "new" model, you can specify where the weights are located so they won't get overwritten. MAX78002 can do all this plus:

We published an application note about multiple models that may be helpful here: https://github.com/analogdevicesinc/MaximAI_Documentation/blob/main/MAX78002/Utilizing%20Multiple%20CNN%20Models.pdf

github-actions[bot] commented 4 months ago

This issue has been marked stale because it has been open for over 30 days with no activity. It will be closed automatically in 10 days unless a comment is added or the "Stale" label is removed.