root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

complex numbers in RDataFrame (PyROOT) #10522

Open ianna opened 2 years ago

ianna commented 2 years ago
etejedor commented 2 years ago

A non-RDF reproducer would be:

>>> import ROOT
>>> ROOT.std.vector['std::complex<double>'].value_type
'std::complex<double>'
>>> ROOT.std.vector['_Complex double'].value_type
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: <class cppyy.gbl.std.vector<_Complex double> at 0x3938048> has no attribute 'value_type'. Full details:
  type object 'vector<_Complex double>' has no attribute 'value_type'
  'std::vector<_Complex double>::value_type' is not a known C++ class
  'value_type' is not a known C++ template
  'value_type' is not a known C++ enum

Our pythonization of std::vector relies on value_type to be always present, which is not the case currently as we can see in the example above.

If I try to do the same in C++ (@Axel-Naumann ?):

root [0] auto t1 = std::vector<std::complex<double>>::value_type();
root [1] t1
(std::complex<double> &) @0x7f69aae08010
root [2] auto t2 = std::vector<_Complex double>::value_type();
error: call to 'setValueNoAlloc' is ambiguous
/home/etejedor/root/fork/build/etc/cling/Interpreter/RuntimeUniverse.h:75:12: note: candidate function
      void setValueNoAlloc(void* vpI, void* vpV, void* vpQT, char vpOn,
           ^
/home/etejedor/root/fork/build/etc/cling/Interpreter/RuntimeUniverse.h:88:12: note: candidate function
      void setValueNoAlloc(void* vpI, void* vpV, void* vpQT, char vpOn,
           ^
/home/etejedor/root/fork/build/etc/cling/Interpreter/RuntimeUniverse.h:102:12: note: candidate function
      void setValueNoAlloc(void* vpI, void* vpV, void* vpQT, char vpOn,
           ^
/home/etejedor/root/fork/build/etc/cling/Interpreter/RuntimeUniverse.h:117:12: note: candidate function
      void setValueNoAlloc(void* vpI, void* vpV, void* vpQT, char vpOn,
           ^

A workaround would be not to use value_type from our std::vector pythonization and just check the element type of the vector via the name of the class.
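The name-based check suggested above could look roughly like the following. This is only a sketch in plain Python, not the actual pythonization code: the helper name `element_type_from_name` is hypothetical, and it assumes the class name has the form `vector<T>` or `vector<T, Alloc>`, possibly with a `std::` prefix.

```python
def element_type_from_name(class_name: str) -> str:
    """Return the first template argument found in a class name.

    Hypothetical helper: it parses the string only and never asks the
    interpreter for `value_type`.
    """
    name = class_name.strip()
    start = name.find("<")
    if start == -1 or not name.endswith(">"):
        raise ValueError(f"not a template instantiation: {class_name!r}")

    # Text between the outermost angle brackets.
    inner = name[start + 1:-1]

    # Cut at the first comma that is not nested inside another template,
    # so that an explicit allocator argument is ignored.
    depth = 0
    for i, ch in enumerate(inner):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            return inner[:i].strip()
    return inner.strip()
```

With this, `element_type_from_name("vector<_Complex double>")` gives `"_Complex double"` even though `value_type` cannot be looked up for that class.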

Axel-Naumann commented 2 years ago

Where does _Complex double come from as a column type? It's a C feature that's rarely used in C++...

ianna commented 2 years ago

> Where does _Complex double come from as a column type? It's a C feature that's rarely used in C++...

from the following definition of an RDF column:

data_frame_xy = data_frame.Define("y", "x*2 +1j")

Axel-Naumann commented 2 years ago

Thanks - so j is C's I?

We can certainly add support for C's _Complex types. That's a new feature that nobody ever missed so far :-)

Alternatively and at least as a temporary workaround, maybe go C++?

root [0] #include <complex>
root [1] auto c = 1.0 + 1i;
root [2] c
(std::complex<double> &) @0x10dcbd070

(This might need a using namespace std::complex_literals - I don't actually know why it works without on my macbook!)

How important is the support of _Complex for you / your usecase?
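On the PyROOT side, one way to "go C++" without depending on any literal suffix is to spell out the std::complex<double> constructor in the expression string. The sketch below is an assumption-laden illustration, not something proposed in this thread: the helper `complex_column_expr` is hypothetical, and the RDataFrame part runs only if ROOT can be imported and assumes the jitted code can see std::complex.

```python
def complex_column_expr(real_expr: str, imag: float = 1.0) -> str:
    """Build a C++ expression string that yields a std::complex<double>.

    Hypothetical helper: `real_expr` is pasted in verbatim as the real
    part, `imag` becomes the imaginary part.
    """
    return f"std::complex<double>({real_expr}, {float(imag)!r})"


if __name__ == "__main__":
    expr = complex_column_expr("x*2")
    print(expr)

    try:
        import ROOT  # only needed for the RDataFrame part
    except ImportError:
        ROOT = None

    if ROOT is not None:
        # Three entries; x is simply the entry number converted to double.
        df = ROOT.RDataFrame(3).Define("x", "double(rdfentry_)").Define("y", expr)
        # Report the column type deduced by the jitted expression.
        print(df.GetColumnType("y"))
```

The string produced for the default arguments is `std::complex<double>(x*2, 1.0)`, which contains neither the `j` nor the `i` suffix.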

ianna commented 2 years ago

@Axel-Naumann - Thanks! It would be nice - for consistency. 'j' is Python's notation for the imaginary unit. When I try to pass 'i' instead, the result still comes back as:

cpp_ref    = (1.7223964231088758+1j)

Axel-Naumann commented 2 years ago

But IIUC j in data_frame.Define("y", "x*2 +1j") is C++, so where does that come from? Is that a pythonization, @etejedor ?

etejedor commented 2 years ago

> But IIUC j in data_frame.Define("y", "x*2 +1j") is C++, so where does that come from? Is that a pythonization, @etejedor ?

No it's not a pythonization, it's cppyy translating std::complex<double> to Python's complex since it crossed the C++-Python boundary (@ianna I guess cpp_ref here is one of the elements of the collection you obtain with Take?).

ianna commented 2 years ago

> > But IIUC j in data_frame.Define("y", "x*2 +1j") is C++, so where does that come from? Is that a pythonization, @etejedor ?
>
> No it's not a pythonization, it's cppyy translating std::complex<double> to Python's complex since it crossed the C++-Python boundary (@ianna I guess cpp_ref here is one of the elements of the collection you obtain with Take?).

Yes. However, it is strange that a string that should(?) be interpreted as C++ would interpret it as a Python definition of complex and make it a C complex...

etejedor commented 2 years ago

> Yes. However, it is strange that a string that should(?) be interpreted as C++ would interpret it as a Python definition of complex and make it a C complex...

In data_frame.Define("y", "x*2 +1i"), the Python string "x*2 +1i" is passed to C++ as a std::string. That std::string contains a C++ expression, which resolves to the std::complex<double> type when RDataFrame jits it internally. This means that the type of that column in C++ is std::complex<double>. But when you bring one of the elements of that column into the Python world, the C++ std::complex<double> is translated to a Python complex (in the same way that, e.g., a C++ double is translated to a Python float).
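The last step can be seen without ROOT at all. The snippet below is a plain-Python sketch that builds a complex value by hand, standing in for an element that has already crossed the C++-Python boundary, and shows that the `j` comes from how Python prints its own complex type rather than from the expression string.

```python
# Stand-in for one element of the column after conversion to Python.
value = complex(1.7223964231088758, 1.0)

# Python always uses 'j' for the imaginary unit when printing a complex,
# whatever suffix was used in the C++ expression that produced it.
text = repr(value)
print(type(value).__name__, text)
```

The printed representation is `(1.7223964231088758+1j)`, the same form as the cpp_ref value quoted earlier in the thread.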

etejedor commented 2 years ago

@Axel-Naumann should I assign this to you / @jalopezg-r00t for the support of _Complex, or do you think it's enough to ask people to use std::complex expressions?