ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
5.07k stars 586 forks source link

feat: UX: improve Interval API #8165

Closed NickCrews closed 7 months ago

NickCrews commented 7 months ago

Is your feature request related to a problem?

I have this function

def dob_to_age(dob: DateValue) -> FloatingValue:
    """Convert a date of birth to an age in years (float)."""
    return (ibis.now().date() - dob).days.cast(int) / 365.25

I had several annoying experiences implementing it

  1. Dates and Times use a method for getting a component, eg date.month(). Interval uses a property. We should be consistent. I would vote for methods. Perhaps there are other places where there is this inconsistency we should consider when making this decision?
  2. The docstring and docs for Interval.days don't say the return type (IntervalValue), so I had to experiment to figure that out. If we moved to methods, the docs would show the return type, so that would be better. But still I think the docstring should also say this.
  3. The cast to int is needed because otherwise I get unsupported operand type(s) for /: 'IntervalScalar' and 'float'. I expect .days() to give me an IntegerValue, like Date.month() and friends.
  4. It would be nice if this was a oneliner (ibis.now().date() - dob).years(). But I get a Cannot convert interval value from unit D to unit Y with this. Which I think makes sense, because semantics to time are not strict and in some interpretations there aren't 365.25 days in a year. Not sure there is anything to do about this, but just wanted to flag this as something I'm running into. Not sure what backends do, but probably we are at their mercy and should just do what they do.

Describe the solution you'd like

I think the best usage would be (ibis.now().date() - dob).days() / 365.25

  1. move to methods
  2. return IntegerValue
  3. add return type in docstring
  4. in class docstring, add note about limitations of casting units.

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

Code of Conduct

cpcloud commented 7 months ago

Thanks for the issue.

I think for the specific case of age and for any case where you need an integer number of some unit, give the delta method a try.

Ideally your function could be:

def dob_to_age(dob: DateValue) -> FloatingValue:
    """Convert a date of birth to an age in years (float)."""
    return ibis.now().delta(dob, "days") / 365.25

The other issues are somewhat orthogonal here:

  1. Methods versus properties: we're considering deprecating a bunch of these properties on IntervalValue because of their fraught nature.
  2. This is addressed by the delta method
  3. We can add the return type in the docstring. Would you mind submitting a PR for that?
  4. I think casting here is a huge anti-pattern and a band-aid for an API that is hard to use. I think delta + manual unit conversion is probably the only sane way right now to get what you want, especially because how you want to handle the ambiguities in unit definitions should be up to you as the user of ibis/your application/library.
cpcloud commented 7 months ago

Let's discuss more about intervals in #8138. See https://github.com/ibis-project/ibis/issues/8138#issuecomment-1919702297 for details.