emilyriederer / website

Blog / website repo
https://emilyriederer.com

polars’ Rgonomic Patterns | Emily Riederer #62

Open utterances-bot opened 6 months ago

utterances-bot commented 6 months ago


In this follow-up post to Python Rgonomics, we deep dive into some of the advanced data wrangling functionality in Python’s polars package to see how its power tools like column selectors and nested data structures mirror the best of dplyr and tidyr’s expressive and concise syntax.

https://www.emilyriederer.com/post/py-rgo-polars/

tashapiro commented 6 months ago

awesome post Emily! this is a nice rosetta stone for tidyverse natives.

first time learning about polars, i think you just convinced me to give data wrangling in Python another shot 😄

stevecrawshaw commented 6 months ago

Thanks, Emily. I think you've captured the key aspects of the core functions in both these libraries very concisely. I am a tidyverse native, but I learned pandas as part of a data analyst apprenticeship and found it very frustrating. Polars, however, I find easier to use, and it is amazingly fast.

muhsinciftci commented 6 months ago

Dear Emily, this is indeed a great post. As a person coming from R, I wonder if it is possible to apply functions in parallel across nested structures (similar to applying future_map to nested data frames in R), for example fitting regressions over a nested data frame. Many thanks.

emilyriederer commented 6 months ago

@tashapiro - Thanks for reading and can't wait to see what you build! Big fan of your work

@stevecrawshaw - Completely agree. I never want to dunk on pandas because its impact for Python users is truly impressive, but I can't say I have ever found it aesthetically enjoyable. I think it's probably unfortunate if that's the main intro some people get to Python.

@muhsinciftci - There are similar mapping functions like map_rows(), but unfortunately it's harder to get them to return complex objects like models back to the dataframe because of the underlying Arrow data abstraction (some discussion of that here). Nesting is still super useful for other things like moving in and out of JSON though!

jimjam-slam commented 5 months ago

Thanks Emily—definitely bookmarking this for my next Python project!

quantPsych commented 2 months ago

Thanks for the blog. What do you think of pyjanitor? It is inspired by dplyr and built on top of pandas.