v923z / micropython-ulab

a numpy-like fast vector module for micropython, circuitpython, and their derivatives
https://micropython-ulab.readthedocs.io/en/latest
MIT License

Why not port from numpy? #160

Closed by yang8621 4 years ago

yang8621 commented 4 years ago

Hi

Is ulab created from scratch? Is it possible to (Why not) port from numpy by some modifications to adapt micropython? E.g. by removing the CPython dependencies.

Thank you.

v923z commented 4 years ago

Is ulab created from scratch?

Yes.

Is it possible to (Why not) port from numpy by some modifications to adapt micropython?

Most probably not.

  1. The most basic ndarray container would consume much more space than is available on a microcontroller. If you strip all sub-modules from ulab, the code fits into around 20 kB. I would be really surprised if numpy could be slimmed down that much.

  2. I don't know how, using numpy, we could retain the module granularity (sub-modules) that I mentioned above. This is a departure from numpy, but we had good reasons for it.

  3. In numpy, many things are actually implemented in python, and the interpreted python interface calls the underlying C code, if there is any. That might be OK on a computer, but if you have only 30 kB of free RAM, you have to think twice. Also, if a function is written in python, you don't gain speed.

  4. License: though numpy's BSD license is probably OK, I wanted a library whose license is on par with that of micropython itself.

  5. numpy has 13 data types altogether. Most of these don't really make sense on a microcontroller (yet they consume a lot of space), and I don't know if there is an easy way of removing the useless data types.
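
As a rough illustration of the RAM argument in points 3 and 5, the sketch below estimates how many elements of a given width fit into the 30 kB of free RAM mentioned above. The selection of element types and the helper name `max_elements` are assumptions made for this example, not part of ulab's or numpy's API, and the estimate ignores any per-array header overhead.

```python
import struct

FREE_RAM_BYTES = 30 * 1000  # the "30 kB of free RAM" figure from point 3

# Hypothetical selection of element types, given as struct format codes.
# The "<" prefix selects standard sizes, independent of the platform.
ELEMENT_FORMATS = {
    "uint8": "<B",
    "int16": "<h",
    "float32": "<f",
    "float64": "<d",
}


def max_elements(fmt, budget=FREE_RAM_BYTES):
    """Return how many elements of struct format `fmt` fit into `budget` bytes."""
    return budget // struct.calcsize(fmt)


for name, fmt in ELEMENT_FORMATS.items():
    size = struct.calcsize(fmt)
    print(name, size, "bytes/element ->", max_elements(fmt), "elements")
```

Under these assumptions, doubling the element width halves the number of values that can be held at once, which is one reason why wide data types are costly on a small device.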

E.g. by removing the CPython dependencies.

numpy depends on a number of packages, some of which are still written in Fortran. I don't even know where we would have to start in order to compile it for a small ARM processor.

So, all in all, I am not sure whether there would be any advantage in bringing in numpy. I see that we wouldn't have to re-invent the wheel, but numpy is a huge and very general library, and consistently weeding out the unnecessary components would be an equally huge undertaking.

The general comments in https://github.com/micropython/micropython/issues/6261 apply here, too.

yang8621 commented 4 years ago

Thank you for your response.

Regardless of the RAM limitation, I'm not sure if it's possible to support all 13 data types on micropython? And to support numpy then? For Fortran and Cython dependencies, maybe it should be rewritten at all. Maybe at the last, most numpy will have been rewritten. :(

v923z commented 4 years ago

Regardless of the RAM limitation, I'm not sure if it's possible to support all 13 data types on micropython?

The RAM issue can't just be dismissed. We are talking about microcontrollers with 100 kB of RAM. That has to hold both the python interpreter and numpy, which is not a trivial task. ulab has been reported to run on a one-dollar microcontroller (https://gitlab.com/rcolistete/micropython-samples/-/tree/master/Pyboard/Firmware/v1.12_with_ulab/). I would challenge you to pull that off with numpy.

And to support numpy then?

But why should we do that? I am trying to understand what in numpy you are missing. If there is a function that you would like to use, you can always raise an issue, and we can implement it.

One other problem is that most numpy functions take tons of arguments. They are simply too general, and that costs disk space: numpy is at least 35 MB compiled, and that already assumes that the library can use system resources that are not necessarily present on bare metal (see https://towardsdatascience.com/how-to-shrink-numpy-scipy-pandas-and-matplotlib-for-your-data-product-4ec8d7e86ee4).

To put it differently: ulab takes about 35 kB, which is roughly 0.001 of numpy's size. Would you be willing to sift through numpy and discard 0.999 of it, so that it fits on a microcontroller?

Moreover, ulab is not only numpy. We have implemented a number of useful functions from scipy. That is an extra 135 MB!
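
The size comparison can be checked with a few lines of arithmetic. The sketch below uses only the approximate figures quoted in this thread (35 kB for ulab, 35 MB for numpy, 135 MB for scipy) and assumes decimal units; the variable names are purely illustrative.

```python
# Approximate sizes quoted in this thread, in bytes (decimal units assumed).
ULAB_BYTES = 35 * 1000          # ulab: about 35 kB
NUMPY_BYTES = 35 * 1000 ** 2    # numpy: at least 35 MB compiled
SCIPY_BYTES = 135 * 1000 ** 2   # scipy: an extra 135 MB

# Fraction of numpy's size that ulab occupies.
ratio_numpy = ULAB_BYTES / NUMPY_BYTES

# Fraction of numpy and scipy combined that ulab occupies.
ratio_both = ULAB_BYTES / (NUMPY_BYTES + SCIPY_BYTES)

print("ulab / numpy:", ratio_numpy)
print("ulab / (numpy + scipy):", round(ratio_both, 6))
```

With scipy included, the fraction that would have to survive the trimming gets even smaller than the 0.001 quoted for numpy alone.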

For Fortran and Cython dependencies, maybe it should be rewritten at all. Maybe at the last, most numpy will have been rewritten. :(

I strongly believe that in this particular case, starting from scratch and building from the bottom up is the only reasonable course of action.

yang8621 commented 4 years ago

That makes sense. Thank you for your response.