ux: change non-interactive repr to look more like interactive repr

jcrist commented 1 week ago

Currently when constructing ibis expressions in non-interactive mode (the default), expressions repr as a description of the operations they're composed of:

In [1]: import ibis

In [2]: t = ibis.examples.diamonds.fetch()

In [3]: t.mutate(volume=t.x * t.y * t.z)
Out[3]: 
r0 := DatabaseTable: diamonds
  carat   float64
  cut     string
  color   string
  clarity string
  depth   float64
  table   float64
  price   int64
  x       float64
  y       float64
  z       float64

Project[r0]
  carat:   r0.carat
  cut:     r0.cut
  color:   r0.color
  clarity: r0.clarity
  depth:   r0.depth
  table:   r0.table
  price:   r0.price
  x:       r0.x
  y:       r0.y
  z:       r0.z
  volume:  r0.x * r0.y * r0.z

While this expr repr can be nice for inspection, it's rarely what I want when building up expressions lazily. Since ibis expressions are very composable, rarely do I need to know the steps used to get to a certain expression (e.g. I don't care that a group_by or filter was called earlier). Really all I care about is the schema/type of the object.

I propose we:

1. Keep the existing expr repr, but expose it via some other method, perhaps expr.explain() or something similar.
2. Move to a repr similar to the interactive one, except showing no rows, only a row of ellipses. This would give a similar experience to iterating in interactive mode, but without executing anything. For prior art, this is also what dask does.

A quick mockup:

In [1]: import ibis

In [2]: t = ibis.examples.diamonds.fetch()

In [3]: t.mutate(volume=t.x * t.y * t.z)
Out[3]: 
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ carat   ┃ cut       ┃ color  ┃ clarity ┃ depth   ┃ table   ┃ price ┃ x       ┃ y       ┃ z       ┃ volume    ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ float64 │ string    │ string │ string  │ float64 │ float64 │ int64 │ float64 │ float64 │ float64 │ float64   │
├─────────┼───────────┼────────┼─────────┼─────────┼─────────┼───────┼─────────┼─────────┼─────────┼───────────┤
│       … │ …         │ …      │ …       │       … │       … │     … │       … │       … │       … │         … │
└─────────┴───────────┴────────┴─────────┴─────────┴─────────┴───────┴─────────┴─────────┴─────────┴───────────┘

In [4]: t.mutate(volume=t.x * t.y * t.z).select("carat", "volume")
Out[4]: 
┏━━━━━━━━━┳━━━━━━━━━━━┓
┃ carat   ┃ volume    ┃
┡━━━━━━━━━╇━━━━━━━━━━━┩
│ float64 │ float64   │
├─────────┼───────────┤
│       … │         … │
└─────────┴───────────┘
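As a rough illustration of the mockup above, here is a minimal pure-Python sketch that renders a schema as a boxed header with a single ellipsis row. The names `render_schema_only` and `NUMERIC_PREFIXES` are hypothetical, the schema is passed as plain (name, dtype) string pairs, and this is not how ibis itself renders tables. Column widths are the minimum needed, so they will not match the mockup exactly.

```python
# Hypothetical helper: dtype name prefixes treated as numeric (right-aligned).
NUMERIC_PREFIXES = ("int", "uint", "float", "decimal")


def render_schema_only(schema):
    """Render (name, dtype) pairs as a boxed table with an ellipsis row.

    No data is needed: only the column names and type names are shown.
    """
    names = [name for name, _ in schema]
    dtypes = [dtype for _, dtype in schema]
    widths = [max(len(name), len(dtype)) for name, dtype in schema]

    def rule(left, mid, right, fill):
        # Horizontal border, e.g. "┏━━━┳━━━┓".
        return left + mid.join(fill * (w + 2) for w in widths) + right

    def row(edge, cells):
        # One line of cells, each padded by a space on both sides.
        return edge + edge.join(f" {cell} " for cell in cells) + edge

    # Right-align the placeholder in numeric columns, left-align elsewhere.
    placeholders = [
        "…".rjust(w) if d.startswith(NUMERIC_PREFIXES) else "…".ljust(w)
        for d, w in zip(dtypes, widths)
    ]

    return "\n".join(
        [
            rule("┏", "┳", "┓", "━"),
            row("┃", [n.ljust(w) for n, w in zip(names, widths)]),
            rule("┡", "╇", "┩", "━"),
            row("│", [d.ljust(w) for d, w in zip(dtypes, widths)]),
            rule("├", "┼", "┤", "─"),
            row("│", placeholders),
            rule("└", "┴", "┘", "─"),
        ]
    )


print(render_schema_only([("carat", "float64"), ("cut", "string")]))
```

Since only names and types are needed, a repr along these lines would never have to execute the expression.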

cpcloud commented 1 week ago

In all seriousness, I really like this idea!

jcrist commented 1 week ago

Sounds good! I think we should aim to get this in for 10.0 then.

One open question is what to do with scalars (since in interactive mode they only show the value, not the type).

A few options:

1. Add the type to the interactive repr (but keep scalars unnamed)?

# Interactive
┌────────────┐
│ float64    │
├────────────┤
│   43040.87 │
└────────────┘

# Non-interactive (could also only add the type to the non-interactive version?)
┌─────────┐
│ float64 │
├─────────┤
│       … │
└─────────┘

2. Some non-boxed repr?

# Interactive
┌──────────┐
│ 43040.87 │
└──────────┘

# Non-interactive
Scalar<float64>

3. Put the type in the box?

# Interactive
┌──────────┐
│ 43040.87 │
└──────────┘

# Non-interactive (this might be easy to mistake for an interactive string scalar with value `"Scalar<float64>"`)
┌─────────────────┐
│ Scalar<float64> │
└─────────────────┘

4. Something else?

I have a slight preference for the first option, but :shrug:.
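To make the first option concrete, here is a small sketch of a scalar renderer that always shows the type and prints either the value or an ellipsis. The names `render_scalar` and `_UNEVALUATED` are hypothetical and exist only for this illustration; nothing here reflects an actual ibis API.

```python
# Hypothetical sentinel meaning "the scalar has not been executed".
# A dedicated object is used so that a real None/NULL value stays distinguishable.
_UNEVALUATED = object()


def render_scalar(dtype, value=_UNEVALUATED):
    """Render a scalar as a box with its type on top.

    Interactive mode would pass the computed value; non-interactive mode
    would omit it and get an ellipsis instead.
    """
    body = "…" if value is _UNEVALUATED else str(value)
    width = max(len(dtype), len(body))
    return "\n".join(
        [
            "┌" + "─" * (width + 2) + "┐",
            "│ " + dtype.ljust(width) + " │",
            "├" + "─" * (width + 2) + "┤",
            "│ " + body.rjust(width) + " │",
            "└" + "─" * (width + 2) + "┘",
        ]
    )


print(render_scalar("float64"))            # non-interactive: type plus ellipsis
print(render_scalar("float64", 43040.87))  # interactive: type plus value
```

Both modes share one layout here, so the only difference between them is whether the last row holds a value or a placeholder.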

gforsyth commented 1 week ago

I definitely like the look of this -- it might be nice to keep the old repr around for OUR inspection, but make it private.

I like option 1 above, but I can get on board with any of them.

drin commented 1 week ago

I randomly found this and just wanted to chime in: I think this sounds like a great idea and moving the old repr to an explain function or something similar makes a lot of sense.

> it might be nice to keep the old repr around for OUR inspection, but make it private.

not sure what visibility you mean by private (maybe just surrounded with __?), but it'd be nice for it to be easily accessible for Substrait users. I could also imagine wanting to extend it with various verbosity flags (ops only, ops + predicates, etc.) to make validation or general observability easier.

gforsyth commented 1 week ago

> not sure what visibility you mean by private

yeah, just with a leading _ so it doesn't show up in tab-completion, but I'm also not opposed to leaving it more readily available if there's desire for that.
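For illustration, the split discussed in this thread (a schema-only `__repr__` plus a leading-underscore method for the operation tree, with a verbosity switch) could look roughly like the toy class below. `ToyExpr`, `_explain`, and the `verbose` flag are all hypothetical names for this sketch; this is not the ibis expression class or its API.

```python
class ToyExpr:
    """Hypothetical stand-in for an expression; not the ibis class."""

    def __init__(self, schema, ops):
        self.schema = dict(schema)  # column name -> type name
        self.ops = list(ops)        # (operation name, detail) pairs, in order

    def __repr__(self):
        # Default repr: only the schema, no history of how it was built.
        cols = ", ".join(f"{name}: {dtype}" for name, dtype in self.schema.items())
        return f"ToyExpr<{cols}>"

    def _explain(self, verbose=False):
        # Leading underscore keeps this out of default tab-completion.
        # verbose=False lists operations only; verbose=True adds their details.
        if verbose:
            return "\n".join(f"{op}: {detail}" for op, detail in self.ops)
        return "\n".join(op for op, _ in self.ops)


expr = ToyExpr(
    schema=[("carat", "float64"), ("volume", "float64")],
    ops=[("DatabaseTable", "diamonds"), ("Project", "volume = x * y * z")],
)
print(repr(expr))
print(expr._explain())
print(expr._explain(verbose=True))
```

Renaming `_explain` to a public name would be the only change needed if the operation tree should stay readily available.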

ux: change non-interactive repr to look more like interactive repr #10095