Open mratsim opened 7 years ago
Placeholder.
To avoid polluting this meta-thread with specific discussion on certain topics (say what I want in the random library), this will link to the discussion topics:
No issue open
For sampling from other distributions, there is Alea. I have to clean it up - some examples fail with the latest concept changes in devel - but I hope to make these work again soon
This almost makes me want to buy arewescientistsyet.org à la http://www.arewewebyet.org/. Perhaps you'd be interested in creating something like this? :)
I would also add in differential equation solvers as well as Markov chain Monte Carlo samplers...
Over the last 2 months I've been working on high level bindings to the HDF5 library:
https://github.com/Vindaar/nimhdf5
It's still very much work in progress (also due to my limited knowledge of Nim and the more low level parts of HDF5). As a raw wrapper it should be fully functional, with the downside of the (imo not very intuitive) C API. But the high level bindings are improving slowly. There's an example (examples/h5_create_dataset_hl.nim) showing the available features.
Plotting
Status: no libraries
By far the most important category is missing from this list I feel; and that is first-class two way python bindings.
The ability of python to easily (relatively, for the time) interface with the then-dominant languages was pivotal in its adoption in scientific computing.
I'd use a ton of nim from python right away if there was a clean, boilerplate-free method of sending ndarrays back and forth between the two. Last time I checked there was not, and as much as I like nim I don't see it replacing my entire python ecosystem any day soon.
In particular, I would much rather use nim than cython or numba or any such half-baked language. Boost-python has the bindings figured out pretty well but then again I can rarely justify having to deal with C++.
But a system of bindings with the convenience of boost-python but without the C++ would massively expand the usability of nim for me and (I think it's not just me) for other scientific programmers.
Also, starting out a project in nim would be a much better proposition if I had the reassurance I could always pop up a matplotlib debug figure without any hassle.
@EelcoHoogendoorn there are a few projects.
None of these projects is fully mature at this point, but this is definitely something doable
Of course it is doable; both Python and nim are Turing complete. But without having the time to put in the work to make these into feature complete mature solutions myself, it is what is stopping me from using nim at present.
The good news is that this should be a lot less work than reinventing matplotlib.
On May 2, 2018 15:29, "Andrea Ferretti" notifications@github.com wrote:
@EelcoHoogendoorn https://github.com/EelcoHoogendoorn there are a few projects.
- nim-pymod https://github.com/jboy/nim-pymod is not maintained and a little cumbersome in that it requires its own scripts to build, but it allows sending ndarrays back and forth
- nimpy https://github.com/yglukhov/nimpy looks more actively maintained but I am not sure whether it supports Numpy types
- python3 https://github.com/matkuki/python3 seems to be another one, but I am not sure of its status
None of these projects is fully mature at this point, but this is definitely something doable
I think most active nim users are aware of this by now, but there's a functioning plotting library here: https://github.com/brentp/nim-plotly
Since it serializes to JSON and uses plotly.js to plot (it still works with the C backend), the number of points it can handle is limited, but when using WebGL it can plot ~200K points in my browser and still be tolerably responsive.
Hi brentp;
That's looking pretty cool indeed! Note that I am not trying to take a jab at plotting in nim specifically, but trying to make a point about the relative size of the ecosystem of python and nim generally; plotting is just an example.
I think it'd be foolish to expect nim to be able to compete with python anytime soon on that front; making sure we have first-class two-way interop between the two sounds like it might happen a decade sooner at least.
And finally we can do non-linear least square fitting in Nim :)
Finally spent some time to make the interface for my NLopt wrapper nicer and create a PR for nimble for it. So if non-linear least square fitting isn't for you, maybe general nonlinear optimization is. ;)
For some precision engineering/scientific applications, the ability to use arbitrary precision floating point arithmetic would be useful. Does an MPFR wrapper a la Julia's built-in support for BigFloat belong on this list?
@abudden Certainly.
It seems that there is still no computer algebra system module like https://www.sympy.org/. I also made a post: https://forum.nim-lang.org/t/4165
a decent stats package would be a huge boon for my work. Even if it started with t-test and anova.
https://github.com/fragcolor-xyz/nimtorch
Full pytorch for nim, for you.
Do we want a category for natural language processing? Examples of Python libraries are nltk, gensim, spacy, and scikit-learn.
Also, how about mathematical optimization - like scipy.optimize for example, and how about signal processing - like scipy.signal?
@ihendley I think so, yes.
What about simulation? Something like simulink, modelica or Modia (in Julia).
It would be nice to have something similar to Modia in particular, given Nim's metaprogramming capabilities.
One area where I believe nim could shine is in exporting FMU models (following the FMI standard). I don't see python doing that. And even for Julia it is a struggle, because they need to export the runtime for compiled stuff, which is big and not straightforward (here you can see how the libraries take above 100 MB for a simple example, when compiled ahead of time).
- FMI Code Generator
- FMU SDK
- Sundials: SUite of Nonlinear and DIfferential/ALgebraic Equation Solvers, in order to embed the solver in the FMU. Bindings for this would be useful even on their own.
- SimulatorToFMU
It's been a while since I updated the original post but it's done :)
having a (nearly?) fully functional jupyter kernel would be quite useful for my work and, I suspect for many people.
@brentp: There is (or was) jupyternim: https://github.com/stisa/jupyternim
I'm not sure if it's abandoned and/or still compiles (last activity Oct 2018); I have never used it. Its downside is that it was written without hot code reloading in mind, of course. However, I think it'd provide a nice basis for an updated implementation, which uses HCR for the relevant parts and the socket communication of jupyternim.
I once started playing around with HCR, but wasn't very successful even implementing a trivial repl, https://github.com/vindaar/brokenrepl. Posting it here if anyone wants to give it a try.
Yes, I saw that and inim from @stisa. Now that there are ggplots and dataframes, the notebook would be a boon.
(my) jupyternim and inim are the same code, there was a naming conflict with https://github.com/AndreiRegiani/INim so I renamed it. I agree it's due an update, but I have been pretty busy this year.
Last time I saw, HCR was limited to JS target, looking at https://nim-lang.org/docs/hcr.html there was a lot of progress so I may have a look into adopting it when I get some free time, if nobody starts working on it first.
@mratsim, @brentp, @HugoGranstrom and I chatted recently about trying to unify the science related code a little more. While we didn't decide anything specific yet, we talked about creating an organization to hold related repositories in the future:
I only invited a few people who, off the top of my head, use Nim for science related stuff. If you want to join, feel free to message me or just join the gitter channel here:
https://gitter.im/SciNim/community
and say hi.
Over Easter I played around with creating a website based on Hugo for this purpose. I am happy to provide it to you.
I have uploaded it here: https://mantielero.github.io/nim4science/
Feel free to use it.
I've just released a pure Nim fixed point number library here
I started working on a geometry library (mainly focused on GIS and CAD), but it is not yet presentable :)
My linear algebra package: https://github.com/planetis-m/manu is still in development and I am happy to accept contributions.
This is a meta-issue to keep track of discussion around Nim scientific libraries.
Primitive libraries
Decimal128: https://github.com/JohnAD/decimal128
Fixed-point: https://gitlab.com/lbartoletti/fpn
Multidimensional arrays, Linear-algebra
Multidimensional arrays are the basic building block of scientific computing; they go beyond 2D or 3D vectors and matrices. Notable non-Nim implementations include Fortran, Julia, Matlab and Numpy.
Status: in-progress
Libraries:
Support
Arraymancer supports dense multidimensional arrays of any type, on CPU (integers, floats, complex), Cuda and OpenCL (float only) and uses BLAS, CuBLAS and Clblast under the hood.
Flambeau provides libtorch bindings and reproduces PyTorch functionality.
Manu is a pure Nim matrix library with no external dependencies
Neo supports dense and sparse float vectors and matrices, on CPU and Cuda (Nvidia GPUs) and also uses BLAS and LAPACK under the hood.
Status: stalled
Libraries:
NimTorch supports most PyTorch features regarding multidimensional arrays, on CPU, Cuda, OpenCL and AMD ROCm provided you compiled PyTorch's Aten backend with the relevant features.
Plotting
Data analysis requires plotting. Notable non-Nim implementations include Python matplotlib and seaborn, Plot.ly (Python, R, Javascript), R ggplot2, Matlab and Facebook Visdom (a simple interface to Plot.ly).
Note that there are a couple of approaches to plotting: either having a charting library, or having a high-level grammar library (similar to SQL) that hides the low-level details of a chart.
Status: in-progress
Libraries:
Proof-of-concepts:
Unmaintained:
ggplotnim is an implementation in pure Nim of the grammar of graphics. gnuplot.nim is a wrapper of gnuplot. Nim-Plotly uses the plot.ly charting library as a backend. Both MetaPlot and Monocle use the Vega visualization grammar.
Image processing library
Computer vision is a thriving area of research. Vision scientists need algorithms that work on images represented as multidimensional arrays (different from, say, Photoshop), preferably multithreaded and GPU accelerated.
Notable non-Nim libraries include OpenCV, Matlab, Python scikit-image, scipy.ndimage and mahotas.
Status: in-progress
Libraries:
Unmaintained:
Nim-opencv provides rough low-level bindings of OpenCV functions.
Dataframe and columnar/tabular data processing
Dataframes are essential to process structured data (say Name, Age, number of products bought, last time of visit). They allow very efficient data manipulation, including easily creating new columns, joining dataframes, converting between types.
Notable non-Nim packages include Python Pandas and R datatable. When data does not fit in RAM, dataframe packages are interfaced with SQL or HDF5 datastores or even Spark for very large scale processing.
Status: in-progress
Libraries:
Random library
Lots of scientific algorithms rely on stochastic processes or random distributions. At the very least, a pseudo-random generator that samples from a normal (Gaussian) distribution is needed.
Notable non-Nim libraries include SciPy.
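As a rough illustration of the minimum asked for here, the sketch below draws normally distributed samples from a uniform pseudo-random generator using the Box-Muller transform. It is written in Python only because the reference libraries named in this thread are Python ones; the function name `sample_normal` and its parameters are made up for this example and are not taken from any Nim library.

```python
import math
import random


def sample_normal(rng, mean=0.0, std=1.0):
    """Draw one sample from N(mean, std^2) via the Box-Muller transform.

    `rng` is any object with a `random()` method returning a float
    in [0, 1), such as `random.Random`.
    """
    # Shift the first uniform draw to (0, 1] so that log() never sees 0.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    # Box-Muller: two independent uniforms give one standard normal.
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + std * z


# Seeded generator so that runs are reproducible.
rng = random.Random(42)
samples = [sample_normal(rng) for _ in range(20000)]
```

A real library would more likely use a faster method (for example the ziggurat algorithm) and expose many more distributions; the point is only that a uniform generator plus a transform is enough to get started.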
Status: in-progress
Libraries:
Statistics library
Notable language: R
Status: standard lib statistics module
Machine learning
Machine learning is how to teach a computer to learn/generalize patterns from data.
Notable non-Nim libraries include: Python's Scikit-Learn and R's Caret. State-of-the-art C++ library to wrap: XGBoost
Status: in-progress
Deep learning & neural network.
Deep learning is machine learning with neural networks and is arguably eating the world (or at least Reddit, Hacker News and sponsors). In comparison to most traditional machine learning tools, neural networks can also learn very well from non-structured data (images, sounds, text ...).
Notable non-Nim libraries include: Facebook Torch, Google Tensorflow, Apache and Amazon Mxnet
Status: in-progress
Libraries:
Proof-of-concept:
Non-linear optimization
Status: in-progress
Libraries:
Linear programming
Status: in-progress
Libraries:
Computational Physics
Status: in-progress
Libraries:
Geometry
Computational geometry also requires tuned algorithms for: geometry primitives, polygons and polyhedra, triangulations, distances, shape analysis ...
Notable non-Nim library: CGAL
Status: no library
Scientific serialization format
There are many formats specific to science or even to particular science domains.
Libraries:
Geospatial library
Scientists often need to deal with geospatial coordinates (latitude, longitude), maps and distances. This includes efficient data structures like KD-trees or R-trees to compute distances between points, and distance formulas like Haversine to compute distances on a sphere.
Notable non-Nim libraries include Python's scipy.spatial, Geopy, Shapely
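For reference, the Haversine formula mentioned above can be sketched in a few lines. This is an illustrative Python version, not code from any of the listed libraries; the function name `haversine_km` and the mean Earth radius constant are assumptions chosen for this example.

```python
import math

# Assumed mean Earth radius in kilometres; the real Earth is not a
# perfect sphere, so results are approximate.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the domain of asin.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

A spatial index such as a KD-tree or R-tree then serves to avoid evaluating this distance against every point in a large dataset.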
Status: in-progress
R-tree forum thread.
Proof-of-concepts:
Scientific language bindings
Python:
Unmaintained