functime-org / functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
https://docs.functime.ai
Apache License 2.0
970 stars 52 forks source link

refactor: Properly annotate `train_test_split` #210

Closed baggiponte closed 2 months ago

baggiponte commented 2 months ago

This has been bugging me for a while. Though I can't get rid of one error:

image

@FBruzzesi have any ideas? (Taking the liberty to tag @marcogorelli too, whom I know hates function overloads)

vercel[bot] commented 2 months ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
functime-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 30, 2024 9:32pm
baggiponte commented 2 months ago

Another idea would be to remove the eager argument and make it a mode: Literal["eager", "lazy"] - which I would significantly prefer. But the boolean eager is also widely used in Polars, so I wanted to stay consistent with that (and change even less code).

FBruzzesi commented 2 months ago

I think you need to add another overload-ed specification:

@overload
def _splitter_train_test(
    X: Union[pl.DataFrame, pl.LazyFrame],
    test_size: Union[int, float],
    eager: bool,
) -> Union[Tuple[pl.DataFrame, pl.DataFrame], Tuple[pl.LazyFrame, pl.LazyFrame]]: ...

With that inplace I am not getting any complaint

MarcoGorelli commented 2 months ago

yup - I'd also suggest

eager: Literal[False] = ...,

as that's the default

Taking the liberty to tag @MarcoGorelli too, whom I know hates function overloads

😄 they're ok, i just don't like if they're overdone (like most things..)

baggiponte commented 2 months ago

yup - I'd also suggest

eager: Literal[False] = ...,

as that's the default

Where would you add that?

I think you need to add another overload-ed specification:

So I need three overload, plus the function body, for one boolean variable argument?

MarcoGorelli commented 2 months ago

Where would you add that?

where you already have eager: Literal[False], just add = ... at the end

baggiponte commented 2 months ago

Where would you add that?

where you already have eager: Literal[False], just add = ... at the end

Oh clear. I didn't really understand where I should do that: only for the parameters that have default values that I am overriding?

FBruzzesi commented 2 months ago

@baggiponte bool is the worst scenario 😂 docs