dvgodoy / handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes
MIT License

koalas vs handyspark #16

Closed: sophia-wright-blue closed this issue 5 years ago

sophia-wright-blue commented 5 years ago

It's awesome that you're adding plots (https://github.com/databricks/koalas/issues/293) to koalas, @dvgodoy! I've been using handyspark for plotting with Spark dataframes. Will you be continuing the development of handyspark, or would you recommend switching to koalas?

dvgodoy commented 5 years ago

Hi @sophia-wright-blue, thanks for your support :-) I've always missed plots while using Spark, and that's why I chose them as my first contribution to Koalas. There is a fundamental difference between HandySpark and Koalas. On the one hand, Koalas is focused primarily on mimicking pandas for Spark DataFrames, so I expect Koalas to be the better choice for these operations in the very near future. I plan on adding support for groupby in plotting there, as I do with stratify in HandySpark. HandySpark, on the other hand, is not limited to DataFrames: my idea was to tackle different usability aspects of Spark, like evaluation metrics, imputers and so on. So, I think it will be nice to combine both :-)

sophia-wright-blue commented 5 years ago

Thanks for that information, @dvgodoy. Looking forward to more updates from you on handyspark and koalas!