Closed MarcoGorelli closed 8 months ago
Thanks for the issue, definitely looks like a bug!
Looking at this some more, we generally don't support the use of _.col.desc()
/.asc()
in the mutate
or select
APIs.
It "works" with the Polars backend, but I think that's accidental.
Those expressions are only allowed in an order_by
, so the bug here is that we don't enforce that constraint.
Fixed in the-epic-split
by #8100. Closing.
@MarcoGorelli I believe the way to solve this in Ibis (or SQL/relational model in general) is with a join -- perhaps there is a more succinct way
[ins] In [1]: import ibis
[ins] In [2]: import ibis.selectors as s
[nav] In [3]: ibis.options.interactive = True
[ins] In [4]: t = ibis.examples.penguins.fetch()
[ins] In [5]: t = t.mutate(row_number=ibis.row_number()).relocate("row_number").order_by("row_number")
[ins] In [6]: t
Out[6]:
┏━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ row_number ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃
┡━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │
├────────────┼─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ 0 │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │
│ 1 │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │
│ 2 │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │
│ 3 │ Adelie │ Torgersen │ NULL │ NULL │ NULL │ NULL │ NULL │ 2007 │
│ 4 │ Adelie │ Torgersen │ 36.7 │ 19.3 │ 193 │ 3450 │ female │ 2007 │
│ 5 │ Adelie │ Torgersen │ 39.3 │ 20.6 │ 190 │ 3650 │ male │ 2007 │
│ 6 │ Adelie │ Torgersen │ 38.9 │ 17.8 │ 181 │ 3625 │ female │ 2007 │
│ 7 │ Adelie │ Torgersen │ 39.2 │ 19.6 │ 195 │ 4675 │ male │ 2007 │
│ 8 │ Adelie │ Torgersen │ 34.1 │ 18.1 │ 193 │ 3475 │ NULL │ 2007 │
│ 9 │ Adelie │ Torgersen │ 42.0 │ 20.2 │ 190 │ 4250 │ NULL │ 2007 │
│ … │ … │ … │ … │ … │ … │ … │ … │ … │
└────────────┴─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
[ins] In [7]: a = t.select(~s.c("species"))
[ins] In [8]: b = t.select("row_number", "species")
[ins] In [9]: c = a.join(b.order_by(ibis.desc("species")).mutate(row_number=ibis.row_number()), "row_number").order_by("row_number").relocate("row_number", "species")
[ins] In [10]: c
Out[10]:
┏━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ row_number ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃
┡━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64 │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │
├────────────┼─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ 0 │ Gentoo │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │
│ 1 │ Gentoo │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │
│ 2 │ Gentoo │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │
│ 3 │ Gentoo │ Torgersen │ NULL │ NULL │ NULL │ NULL │ NULL │ 2007 │
│ 4 │ Gentoo │ Torgersen │ 36.7 │ 19.3 │ 193 │ 3450 │ female │ 2007 │
│ 5 │ Gentoo │ Torgersen │ 39.3 │ 20.6 │ 190 │ 3650 │ male │ 2007 │
│ 6 │ Gentoo │ Torgersen │ 38.9 │ 17.8 │ 181 │ 3625 │ female │ 2007 │
│ 7 │ Gentoo │ Torgersen │ 39.2 │ 19.6 │ 195 │ 4675 │ male │ 2007 │
│ 8 │ Gentoo │ Torgersen │ 34.1 │ 18.1 │ 193 │ 3475 │ NULL │ 2007 │
│ 9 │ Gentoo │ Torgersen │ 42.0 │ 20.2 │ 190 │ 4250 │ NULL │ 2007 │
│ … │ … │ … │ … │ … │ … │ … │ … │ … │
└────────────┴─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
What happened?
Looks like the results differ? (if I could hazard a guess - is pandas' index autoaligning something under the hood?)
What version of ibis are you using?
7.2.0
What backend(s) are you using, if any?
pandas / polars
Relevant log output
No response
Code of Conduct