pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Quadric Layer #105365

Open diro5t opened 1 year ago

diro5t commented 1 year ago

🚀 The feature, motivation and pitch

Introducing a layer with second-order (quadric) hypersurface separability, which reduces model size significantly at equal performance, even before applying sparsity or quantization. This layer can be used anywhere as a drop-in replacement for a Linear layer, at a substantially reduced size.
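To make the separability claim concrete, here is a minimal pure-Python sketch (not the proposed implementation; the exact quadric parameterization is defined in the linked paper and repo) of how a single second-order neuron can realize a closed decision surface, such as a circle, which no single linear neuron can bound:

```python
def linear_neuron(x, w, b):
    # First-order neuron: its decision boundary is a hyperplane,
    # so it can never enclose a bounded region on its own.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def quadric_neuron(x, w, c, r):
    # Illustrative second-order (quadric) neuron: the decision
    # boundary sum_i w_i * (x_i - c_i)^2 = r is a quadric
    # hypersurface (with all w_i > 0, an ellipsoid).
    return sum(wi * (xi - ci) ** 2 for wi, xi, ci in zip(w, x, c)) - r

# One quadric neuron separates "inside the unit circle" from
# "outside" -- a task that would take several linear units.
inside = [(0.0, 0.0), (0.3, -0.2)]
outside = [(2.0, 0.0), (-1.5, 1.5)]
w, c, r = (1.0, 1.0), (0.0, 0.0), 1.0
for p in inside:
    assert quadric_neuron(p, w, c, r) < 0
for p in outside:
    assert quadric_neuron(p, w, c, r) > 0
```

This is where the size-reduction argument comes from: tasks that need several first-order units (or an extra layer) can be handled by fewer second-order units.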

This paradigm is based on my research: https://www.researchgate.net/publication/221582251_Using_Quadratic_Perceptrons_to_Reduce_Interconnection_Density_in_Multilayer_Neural_Networks

There is also other research on higher-order neurons in the field, although, as far as I know, it came later.

I have explained the paradigm further in my GitHub repo: https://github.com/diro5t/deep_quadric_learning

That repo contains further examples of reducing model size in concrete applications, for a single quadric neuron as well as for quadric layers, demonstrated on the MNIST dataset in PyTorch and in tinygrad.

The proposed implementation can be seen in my fork https://github.com/diro5t/pytorch, in torch/nn/modules/linear.py.
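To illustrate the "drop-in for Linear" usage, here is a hedged pure-Python sketch of what such a layer's interface could look like. The parameterization used here (per output unit: weights `w`, centers `c`, bias `b`) and the class name are assumptions for illustration only; the actual formulation lives in the fork linked above:

```python
class QuadricLayer:
    """Hypothetical drop-in for a linear layer: same
    (in_features, out_features) interface, but each output unit
    evaluates a quadric form instead of an affine one."""

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features
        # Deterministic toy initialization; a real layer would use
        # random init and learn these parameters by backprop.
        self.w = [[1.0] * in_features for _ in range(out_features)]
        self.c = [[0.0] * in_features for _ in range(out_features)]
        self.b = [0.0] * out_features

    def forward(self, x):
        # y_j = sum_i w_ji * (x_i - c_ji)^2 + b_j  (assumed form)
        return [
            sum(wji * (xi - cji) ** 2
                for wji, xi, cji in zip(wj, x, cj)) + bj
            for wj, cj, bj in zip(self.w, self.c, self.b)
        ]

layer = QuadricLayer(3, 2)
y = layer.forward([1.0, 2.0, 3.0])
assert len(y) == layer.out_features
assert y[0] == 1.0 + 4.0 + 9.0  # 14.0 with the toy init above
```

A real PyTorch version would subclass `torch.nn.Module`, register `w`, `c`, and `b` as `nn.Parameter` tensors, and implement `forward` with batched tensor ops so it swaps in wherever `nn.Linear` is used.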

This feature is also on the PyTorch 2.1 feature list https://docs.google.com/spreadsheets/d/1TzGkWuUMF1yTe88adz1dt2mzbIsZLd3PBasy588VWgk/edit#gid=2032684683

Alternatives

No response

Additional context

Comment regarding this feature (from the PyTorch 2.1 feature-request spreadsheet, addressed to the author): Please first create an issue (feature request, here: https://github.com/pytorch/pytorch/issues/new/choose) against pytorch/pytorch with this feature description. This issue needs to be accepted by PyTorch maintainers in order to be considered.

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki

mikaylagawarecki commented 1 year ago

Hello @diro5t, thanks for the detailed series of notebooks on quadric neurons and the clear definition of the Quadric layer in the linked repo!

PyTorch Core has some pretty strict rules when it comes to adding a new module or feature; see here.

Currently, this layer seems best suited to be implemented in a separate library; in the future, we will happily reconsider if this feature gains popularity. We will leave this issue open to gauge popularity among other researchers and users.

Separately, re: the feature request for 2.1 that you filed: we use feature requests for features that will get a callout in the release, so this issue should be sufficient to track this request :)

diro5t commented 1 year ago

Thanks for the update -- best regards, Dirk Roeckmann, Five Troop Consulting Services

diro5t commented 1 year ago

How can this feature gain popularity if it is neither offered as a prototype feature nor as a tutorial?

mikaylagawarecki commented 1 year ago

Generally, user activity on issues (like this one!), such as comments discussing the topic and emoji reactions, helps to gauge demand for the feature.

diro5t commented 1 year ago

thanks, got it

diro5t commented 1 year ago

I hope many interested users start to use it as a drop-in for nn.Linear and appreciate the significant model-size reduction without loss of performance, even before applying quantization, pruning, sparsity, etc.