Can we quantify what we mean by "small" neural networks?

PumasAI / SimpleChains.jl

Simple chains

MIT License


For example, does a 5-million-parameter fully connected feedforward network see no speed gains, while a 5,000-parameter network does? I'm making up numbers here, but I hope you get the gist.

The largest model I've ever tried was LeNet on MNIST. At just over 44k parameters, it is a far cry from your 5 million, so I have no idea how SimpleChains would perform for a model that large. Benchmark results for MNIST were shared here: https://julialang.org/blog/2022/04/simple-chains/. At that size, it still did substantially better than the competition on the CPU, and was still competitive with Flux + a very beefy GPU.
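For a sense of scale, here is a rough sketch of how a parameter count like "just over 44k" comes about. It assumes the common LeNet-5 layout on 28x28 grayscale inputs (two unpadded 5x5 convolutions with 6 and 16 channels, each followed by 2x2 max pooling, then dense layers of 120, 84, and 10 units); the thread itself doesn't list the layers, so treat the architecture and the helper names as assumptions. The fully connected layer sizes at the end are made up, just to show what a multi-million-parameter network looks like in the same terms.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel for a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


def lenet5_param_count(side=28):
    """Parameter count for the assumed LeNet-5 layout described above."""
    total = conv_params(5, 1, 6)
    side = (side - 4) // 2  # unpadded 5x5 conv, then 2x2 pool: 28 -> 24 -> 12
    total += conv_params(5, 6, 16)
    side = (side - 4) // 2  # 12 -> 8 -> 4
    flat = side * side * 16  # flattened feature count fed to the dense layers
    for n_in, n_out in [(flat, 120), (120, 84), (84, 10)]:
        total += dense_params(n_in, n_out)
    return total


def mlp_param_count(sizes):
    """Parameter count for a fully connected network with these layer widths."""
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


print(lenet5_param_count())                    # 44426
print(mlp_param_count([784, 2048, 2048, 10]))  # hypothetical MLP, well over 5 million
```

Under these assumptions, most of the 44k parameters sit in the first dense layer, and the hypothetical MLP is more than a hundred times larger.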

But with 5 million parameters, you're almost certainly better off on a GPU, which SimpleChains doesn't support at the moment.

If you want GPU support, I'd also suggest taking a look at Lux.jl: https://github.com/avik-pal/Lux.jl


Can we quantify what we mean by "small" neural networks? #76