Closed matthias-busch closed 2 years ago
Hi guys,
Lovely little course about Polars. I really enjoyed it, and it gave me a nice overview of this new library. So thank you for your work! <3
There is, however, a line missing from the code example under the video for chapter 7 of the course, https://calmcode.io/polars/over-expresssions.html, in the sessionize function.
The line that calculates the session column is missing.
What it says:
def sessionize(dataf, threshold=20 * 60 * 1000):
    return (dataf
        .sort(["char", "timestamp"])
        .with_columns([
            (pl.col("timestamp").diff().cast(pl.Int64) > threshold).fill_null(True).alias("ts_diff"),
            (pl.col("char").diff() != 0).fill_null(True).alias("char_diff"),
        ])
        .with_columns([
            (pl.col("ts_diff") | pl.col("char_diff")).alias("new_session_mark")
        ])
        .drop(["char_diff", "ts_diff", "new_session_mark"]))
What it should say to run and function correctly:
def sessionize(dataf, threshold=1_000_000):
    return (dataf
        .sort(["char", "timestamp"])
        .with_columns([
            (pl.col("timestamp").diff().cast(pl.Int64) > threshold).fill_null(True).alias("ts_diff"),
            (pl.col("char").diff() != 0).fill_null(True).alias("char_diff"),
        ])
        .with_columns([
            (pl.col("ts_diff") | pl.col("char_diff")).alias("new_session_mark")
        ])
        .with_columns([
            pl.col("new_session_mark").cumsum().alias("session")
        ])
        .drop(['char_diff', 'ts_diff', 'new_session_mark']))
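To illustrate why that one line matters, here is a minimal pure-Python sketch of what a cumulative sum over the new_session_mark column does: every True bumps the running count, so all rows up to the next True share one session number. The session_ids helper and the sample marks are hypothetical and only stand in for the Polars expression; they are not part of the course code.

```python
from itertools import accumulate


def session_ids(new_session_marks):
    """Running count of True marks; each True starts a new session."""
    return list(accumulate(int(mark) for mark in new_session_marks))


# Hypothetical marks for six already-sorted rows:
# True where the character changes or the time gap exceeds the threshold.
marks = [True, False, False, True, False, True]
print(session_ids(marks))  # [1, 1, 1, 2, 2, 3]
```

Without this step, the function drops new_session_mark and returns the frame with no session column at all.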
Well spotted! Just made a quick PR. Should be deployed within 5 mins.