functime-org / functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
https://docs.functime.ai
Apache License 2.0
970 stars 52 forks source link

fixed bugs in test_tsfresh #213

Closed abstractqqq closed 1 month ago

abstractqqq commented 2 months ago

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Now all tests in test_tsfresh.py should pass.

  1. I am not sure about the necessity of "test_ratio_beyond_r_sigma_nan_case". The "nan_case" tests are causing us troubles because Polar's behavior has changed. I do not want to add a when pl.len() >= 1 for every feature we add. If user gets an error because they are calling std / sum / abs / any Polars expression on a null (empty) column, Polars now will throw errors instead of returning nan.
  2. Updated Polar's plugin version and all the packages.
  3. Fixed other tests where the names of columns do not match, but outputs are correct.

Any other comments?

vercel[bot] commented 2 months ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
functime-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 20, 2024 7:07am
baggiponte commented 2 months ago

Ciao tq, thank you very much for the PR. Will try to review this ASAP. Not super expert on this branch of the codebase.

abstractqqq commented 1 month ago

Ciao tq, thank you very much for the PR. Will try to review this ASAP. Not super expert on this branch of the codebase.

In my second commit, I removed a duplicate test in ts_fresh and fixed a bug in yeojohnson in test_preprocess

Passed all tears. Tears of joy 🥹

baggiponte commented 1 month ago

OK here I am. Just two things before approving:

  1. Did you run linters/formatters etc?
  2. I see in the first commit that you commented out some tests (am I right?). Could you briefly explain why?

Then we're ready to go. Thank you very much.

abstractqqq commented 1 month ago

I explained in my commit's comment, here is short version: the "nan_case" tests are unnecessary in my opinion. This occurs when some feature is called on a null series ( completely empty). In that case let Polars throw the user an error. If we really want to handle that case, we will need to write a pl.when(input.len() == 0).then().otherwise()... which in my opinion is quite pointless and redundant

baggiponte commented 1 month ago

Ciao tq, sorry for getting back late. One last question: I saw you added the rust-toolchain file. Should we keep it pinned with 2024-04-15? If so, I will just run a round of linter + formatter and merge :)

abstractqqq commented 1 month ago

Adding the toolchain file helped me solve some issues regarding package updates (polars rs and faer)

My local run didn't cause any problems so I guess it will be fine. Let it spin and take a look?

baggiponte commented 1 month ago

Adding the toolchain file helped me solve some issues regarding package updates (polars rs and faer)

My local run didn't cause any problems so I guess it will be fine. Let it spin and take a look?

Yes, sorry if I wasn't clear. I wonder why you used nightly-2024-04-15 but it makes sense. It's rust 1.79 but "frozen" at that date. Will merge PR. We can switch to stable when it's released (I read June 13th).