NNPDF / pineappl

PineAPPL is not an extension of APPLgrid
https://nnpdf.github.io/pineappl/
GNU General Public License v3.0
12 stars 3 forks source link

Lists to numpy arrays in python interface #106

Closed alecandido closed 2 years ago

alecandido commented 2 years ago

This should close #82

I don't believe there is an automatic list to array conversion, so we might need to update also some other code after this (hopefully most is done in the pure python layer).

For this reason, I'm not using arrays where lists are expected. E.g. list of tuples won't be generate as 2 dimensional arrays in most cases: https://github.com/N3PDF/pineappl/blob/3edf25ee8540a0141bb43425d47a24939fa21634/pineappl_py/src/grid.rs#L275 or lists of objects: https://github.com/N3PDF/pineappl/blob/3edf25ee8540a0141bb43425d47a24939fa21634/pineappl_py/src/grid.rs#L69

I broke backward compatibility to pass an actual 5 dimensional array as operator, instead of splitting it in flattened + shape.

Most likely we want other updates before merging:

codecov[bot] commented 2 years ago

Codecov Report

Merging #106 (1d95f6c) into master (d0d7bab) will decrease coverage by 0.03%. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #106      +/-   ##
==========================================
- Coverage   88.46%   88.43%   -0.04%     
==========================================
  Files          31       31              
  Lines        3070     3070              
==========================================
- Hits         2716     2715       -1     
- Misses        354      355       +1     
Impacted Files Coverage Δ
pineappl_cli/src/plot.rs 92.63% <0.00%> (-0.53%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d0d7bab...1d95f6c. Read the comment docs.

alecandido commented 2 years ago

Current version should be breaking even tests (even though we have a limited amount of them), and for sure it will break some usage (since we now require numpy arrays everywhere in the input).

We can decide to wrap all functions in the pure python layer, in order to accept both lists and numpy arrays (np.array is idempotent), but I don't know if it's worth. What do you think about this @cschwan?

In any case, pineappl (and especially interface) is known to be unstable, so it should be fine anyhow.

alecandido commented 2 years ago

P.S.: if anyone (i.e. @cschwan or @felixhekhorn) wants to start a review, start grepping for Vec in pineappl_py/src, and consider if what is still left behind is find to keep as a Vec (there are some cases in which numpy array conversion is possible, like lumi functions, but not being a true linear algebra object I preferred to keep as a Vec)

cschwan commented 2 years ago

We can decide to wrap all functions in the pure python layer, in order to accept both lists and numpy arrays (np.array is idempotent), but I don't know if it's worth. What do you think about this @cschwan?

Since you're the sole user of the Python interface for the time being, you should decide this. What's more pythonic?

cschwan commented 2 years ago

Current version should be breaking even tests (even though we have a limited amount of them), and for sure it will break some usage (since we now require numpy arrays everywhere in the input).

Are the Python tests being run in the CI?

alecandido commented 2 years ago

Are the Python tests being run in the CI?

No (even though it's installing python for an unknown reason).

  1. should I add to the same workflow?
  2. I'd propose to add a workflow running the examples, to keep checking them (at least while releasing); citing rust book

Nothing is better than documentation with examples. But nothing is worse than examples that don’t work because the code has changed since the documentation was written.

(section 14.2)

P.S.: do we want to do 1. in this PR or in another one?

alecandido commented 2 years ago

Since you're the sole user of the Python interface for the time being, you should decide this. What's more pythonic?

Python has no strict type system (only type hints for static analyzers), so the pythonic way is duck-typing: if you need a feature on your input (comparison, arithmetic, ...) check for the feature, not for the type. So everything that can work should do it (but there is the maintainability price, since every time you write a function you'll do it twice, even though it's a thin layer).

cschwan commented 2 years ago

P.S.: do we want to do 1. in this PR or in another one?

I'll let you decide this, I don't have strong opinion about either one.

alecandido commented 2 years ago

Done in 9d2a603. Ready to merge.

cschwan commented 2 years ago

@AleCandido Thanks for working on this Issue!

felixhekhorn commented 2 years ago

With the numpy-fication happening in the interface this is still backward compatible, correct?

alecandido commented 2 years ago

With the numpy-fication happening in the interface this is still backward compatible, correct?

Yes, that was half of the purpose. The other half being to always support structures that contains the same information.

felixhekhorn commented 2 years ago

Re casting: https://github.com/N3PDF/pineappl/commit/9b4d75bcc900cd6e735bf451a4d21f81f18b2481

alecandido commented 2 years ago

Still in the same direction: float - > int - > uint 🙃

felixhekhorn commented 2 years ago

Well, as said you need the explicit cast

In [2]: np.array([1,2,3]).dtype
Out[2]: dtype('int64')

In [3]: np.array([1,2,3],dtype=np.uint).dtype
Out[3]: dtype('uint64')

otherwise rust is complaining (and not doing anything by the way)

felixhekhorn commented 2 years ago

I believe another side-effect (slightly breaking backward compatibility) is https://github.com/NNPDF/fktables/commit/7ab2fe663090a0435734e0f0d8ffd0415854f8e8

Unfortunately I don't have the error message still at hand, but it was saying smth like "output array is read-only" ...