Open SimonG85 opened 2 years ago
Complex numbers are not yet supported by the Arrow specification, as far as I know: https://arrow.apache.org/docs/cpp/api/datatype.html
There is an unmerged PR for complex number support in Arrow: https://github.com/apache/arrow/pull/10565
If complex numbers are not in the Arrow spec, it is unlikely that they will appear in arrow2 (which is used by polars).
Ok, thanks. Is it possible to store a complex number as an object, even with limited support? I have a list of columns holding the sin and cos of some quantity (identified by a sin or cos suffix), and I would like to create a new column as cos(x) + j sin(x). I can switch to pandas, but I prefer to work with polars. It's the last step of my workflow.
You can model the complex number as a StructType, with a real and an imaginary component. Something along the lines of: Struct(r: float, i: float)
But then you may have to implement custom arithmetic for this type.
What do you think?
For polars to be adopted by the scientific and engineering communities, complex number support is really crucial.
Personally, I work around this (in py-polars) by representing complex numbers as a Struct, as @potter420 suggests, and implementing the arithmetic by unnesting the struct and operating on the real and imaginary parts (fields). This makes code distribution much trickier.
Would it be an option to implement a data type in Polars that is based on Struct, but with the operators (addition, multiplication, division) defined at the Rust level? No need to wait for Arrow to possibly include complex number specifications.
@monochromatti I'm not sure if I understand you correctly, but you can do this:
import polars as pl

pl.select(
    pl.struct(x=2, y=3) * pl.struct(x=1.5, y=5)
)
# shape: (1, 1)
# ┌───────────┐
# │ x         │
# │ ---       │
# │ struct[2] │
# ╞═══════════╡
# │ {3.0,15}  │
# └───────────┘
@cmdlineluser Unfortunately, the arithmetic for complex numbers differs from what Polars assumes when multiplying Structs. For example, the product of two complex numbers is
$(a+ib)(c+id) = (ac - bd) + i(ad + bc)$
Or, in tuple notation:
(a, b) * (c, d) = (ac - bd, ad + bc)
The product of Structs is (a, b) * (c, d) = (ac, bd). My current solution is to define the arithmetic in an external function, and try to keep a consistent use of Struct with field names "real" and "imag".
def multiply(expr1: pl.Expr, expr2: pl.Expr) -> pl.Expr:
    # Unpack the real and imaginary fields of both struct expressions.
    a, b = expr1.struct.field("real"), expr1.struct.field("imag")
    c, d = expr2.struct.field("real"), expr2.struct.field("imag")
    # (a + ib)(c + id) = (ac - bd) + i(ad + bc)
    real = a * c - b * d
    imag = a * d + b * c
    return pl.struct(real.alias("real"), imag.alias("imag"))
An example use: df.with_columns(multiply(pl.col("z1"), pl.col("z2")).alias("z1*z2"))
I've looked at the list of data types, and complex numbers are not included.
I know Rust doesn't support complex numbers natively, but the num crate does (it is in the dependency list of polars, version 0.4.0).
I can help with a PR, although I'm not a Rust expert (I use polars through its Python bindings).