tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
https://burn.dev
Apache License 2.0

Graph mutation/introspection #781

Closed · xd009642 closed this issue 6 months ago

xd009642 commented 1 year ago

Feature description

Some ability, after loading a neural network, to insert nodes into it or modify it in some manner. I've spent a bunch of time trying to do this in torch (rust bindings only) and tensorflow (in rust and python) and haven't been able to implement it satisfactorily.

Feature motivation

Implementing papers such as Sun, Y., Huang, X., Kroening, D., Sharp, J., Hill, M., & Ashmore, R. (2019). Structural test coverage criteria for deep neural networks. This requires instrumenting each node and using the inputs/outputs to work out how much of the network is actually utilised. Also, in large models there's often talk of internal sparsity, so being able to introspect the inner workings of the network would be useful.

My thought on doing this is some sort of wrapper tensor that logs what happens to the inner tensor to some shared data store, as well as building up a graphical representation of the network (something I can easily render via graphviz and couple with the collected information).
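For the rendering half, I'm picturing something like this minimal sketch, where the `edges` list stands in for whatever (producer, consumer) pairs the shared data store ends up recording:

```rust
// Minimal sketch: render collected (producer -> consumer) op pairs as
// graphviz DOT. The edge list stands in for whatever the wrapper
// tensor's shared data store actually records.
fn to_dot(edges: &[(String, String)]) -> String {
    let mut dot = String::from("digraph network {\n");
    for (from, to) in edges {
        dot.push_str(&format!("    \"{from}\" -> \"{to}\";\n"));
    }
    dot.push_str("}\n");
    dot
}
```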

Another thing I've run into numerous times at work: we want to re-add training nodes to the graph to allow finetuning, and then later remove them again. The tensorflow-based solution involves bundling the training version of the network and the frozen graph together. Being able to do this with just a single graph, modifying it as needed, would be more expressive.

There is every chance that this is possible in tensorflow and torch and we've just missed it - tensorflow isn't particularly well documented (at all), and while torch is much better in terms of docs, this doesn't seem to be something people often want to do. I've omitted a suggested solution because I don't think I understand burn's inner architecture well enough to propose one; I mainly wanted thoughts/opinions. I'm aware there may be a performance impact from having this available all the time, so perhaps there could be an interpreted graph mode which enables it?

nathanielsimard commented 1 year ago

Graphs in Burn are fully dynamic; you can mutate them as you wish without any problems. The current training loop assumes that the model is read-only during the forward pass, but you can modify the module by overriding the optimize function of the trait TrainStep.
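As a rough sketch of what that override could look like (`MyModel`, `MyBatch`, `MyOutput`, and `restructure` are placeholders, and the signatures are approximated from the `TrainStep` trait as it stood around this time, so check the current definition):

```rust
use burn::module::AutodiffModule;
use burn::optim::{GradientsParams, Optimizer};
use burn::tensor::backend::AutodiffBackend;
use burn::train::{TrainOutput, TrainStep};

impl<B: AutodiffBackend> TrainStep<MyBatch<B>, MyOutput<B>> for MyModel<B> {
    fn step(&self, batch: MyBatch<B>) -> TrainOutput<MyOutput<B>> {
        let output = self.forward(batch);
        let grads = output.loss.backward();
        TrainOutput::new(self, grads, output)
    }

    // Override the default `optimize` (which only applies the gradient
    // update) to also mutate the module between steps, e.g. inserting
    // or removing training-only nodes.
    fn optimize<BB, O>(self, optim: &mut O, lr: f64, grads: GradientsParams) -> Self
    where
        BB: AutodiffBackend,
        O: Optimizer<Self, BB>,
        Self: AutodiffModule<BB>,
    {
        let updated = optim.step(lr, self, grads);
        updated.restructure() // hypothetical graph-editing helper
    }
}
```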

For introspection, you could write a backend decorator, similar to how autodiff is actually implemented, to collect metrics on tensors. It would act as a plug-and-play tensor wrapper!
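The shape of that decorator, reduced to a single operation on a toy backend trait (Burn's real `Backend` trait has far more associated types and ops than this sketch shows):

```rust
// Framework-agnostic sketch of the decorator idea: wrap an inner
// backend and record every operation into a shared log before
// delegating to the wrapped implementation.
use std::sync::{Arc, Mutex};

trait Backend {
    fn matmul(&self, lhs: &[f32], rhs: &[f32]) -> Vec<f32>;
}

struct Logging<B: Backend> {
    inner: B,
    log: Arc<Mutex<Vec<String>>>,
}

impl<B: Backend> Backend for Logging<B> {
    fn matmul(&self, lhs: &[f32], rhs: &[f32]) -> Vec<f32> {
        // Record the op (here just sizes; shapes or activation stats
        // would go in the same place) before delegating.
        self.log.lock().unwrap().push(format!(
            "matmul: lhs.len={} rhs.len={}",
            lhs.len(),
            rhs.len()
        ));
        self.inner.matmul(lhs, rhs)
    }
}
```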

xd009642 commented 1 year ago

Oh neat, that's pretty cool! I'll try to find some time to have a play with this 🎉