BIDS-numpy / numpy-paper

Draft of NumPy paper

List of "important" application specific libraries #70

Closed jarrodmillman closed 4 years ago

jarrodmillman commented 4 years ago

@rgommers @seberg @rossbar @pv @bashtage @tylerjereddy

Stefan and I are still working on Fig. 2 (Scientific Python Ecosystem) and have specific ideas about how to improve the "blue smudge." So just ignore that part of the figure for now.

However, we would like some suggestions for the application-specific box, which is meant to include a long list of such projects. The idea is not to call attention to a few specific ones, but to indicate that a massive number of important application-specific libraries build on top of the lower levels of the ecosystem. These libraries don't need to have tons of stars, forks, and contributors, but they should be important in some sense. E.g., a one-person project that was used for a major scientific discovery, a community project used by many people in a specific scientific subfield, or a new library written for an important new scientific collaboration could all make sense here.

It would be nice to have some diversity in fields covered. You may want to keep in mind the following text from the summary:

It plays an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, material science, engineering, finance, and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves \cite{abbott2016observation} and the first imaging of a black hole \cite{eht-imaging}.

If you have any suggestions, please comment below. Stefan and I will choose the ones we like best and update the figure. Also it would be helpful if you could point out ones that should be removed.

I made up the current list by using what I knew, reviewing old SciPy proceedings, using GitHub topics filtered by stars, and some web searching:

rgommers commented 4 years ago

Current version of the figure I'm reviewing:

[image: draft of Fig. 2 (Scientific Python Ecosystem)]

It looks quite good. About included packages: the lower layers LGTM, the domain/application specific ones could be tweaked.

Domain specific

Alternative suggestion:

Application-specific

This box is a little random, some of these could be technique-specific instead.

Suggestions of well-known packages for the Application-specific box:

Candidates for removal:

Packages we're leaving out (so far):

jarrodmillman commented 4 years ago

Here is a slightly different version (I haven't incorporated your feedback yet): [image: ecosystem]

bashtage commented 4 years ago

PySAL is another project that has a good reputation and a reasonable number of stars:

https://github.com/pysal/pysal

Should TensorFlow be in Technique Specific? Or is the relationship with NumPy too arm's-length?

rossbar commented 4 years ago

I'm a big fan of this graphic! I think it's definitely an improvement over the "inverted blue pyramid" version.

I mentioned this in the NumPy community meeting today, but I agree with Ralf wrt Scikit-HEP. I had previously looked into attempting to use it (along with other open-source tools) to try to re-create the famous Higgs "bump" in Python. From my limited experience, Python does not (yet) play a central role in the collection/analysis/simulation of HEP data like those from the LHC.

+1 for QuantEcon and QIIME 2, both of which have tons of great educational material and are contributing to the ExecutableBookProject.

There's also PyNE, the Python library for nuclear engineering, for the domain-specific category, though nuclear engineering is not a huge discipline.

bashtage commented 4 years ago

It is unfortunate that there is no easy way to sort the GitHub dependency data:

https://github.com/numpy/numpy/network/dependents?dependent_type=PACKAGE

Maybe someone could scrape it?
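
If the dependents were collected into (repository, star count) pairs, the sorting itself would be simple. A minimal sketch, assuming the scraping step has already produced such records; the `Dependent` type and the `top_dependents` helper are hypothetical names for illustration, and no particular page layout or API is assumed:

```python
from typing import Iterable, List, NamedTuple, Optional


class Dependent(NamedTuple):
    """One dependent repository (hypothetical record shape)."""
    repo: str   # "owner/name"
    stars: int  # star count at the time the data was collected


def top_dependents(
    records: Iterable[Dependent],
    min_stars: int = 0,
    limit: Optional[int] = None,
) -> List[Dependent]:
    """Return dependents with at least `min_stars`, most-starred first.

    Ties are broken alphabetically by repository name so the
    ordering is deterministic.
    """
    kept = [r for r in records if r.stars >= min_stars]
    kept.sort(key=lambda r: (-r.stars, r.repo))
    return kept if limit is None else kept[:limit]
```

Star counts alone would not capture the "important in some sense" criterion above, so a list like this would only be a starting point for manual review.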

bashtage commented 4 years ago

Probably not big enough (~150 stars, ~30 contributors), but a good example of a domain-specific application:

https://github.com/econ-ark/HARK

It is also a NumFOCUS project.