dvgodoy / handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes
MIT License

Options for hist() plot #6

Closed: sophia-wright-blue closed this issue 5 years ago

sophia-wright-blue commented 5 years ago

This is a really useful library @dvgodoy! I have a question about the options available for the hist() plot. The command hdf.cols['Embarked'].hist(ax=axs[0]) does not accept many of the keywords that are usually available with the pandas hist() plot. For example, bins is accepted, but grid=True and range are not.

How do I find out which keyword arguments can be passed to handyspark dataframe plots? I'd greatly appreciate your feedback. Thanks!

dvgodoy commented 5 years ago

Hi Sophia,

Thank you very much! The ax option is available for all plots. In the current release, hist() accepts only the bins argument. For boxplot(), you can specify showfliers and k (which defines which points are considered outliers). scatterplot() does not take any extra arguments. I am working on fixing some performance bottlenecks, and I will likely improve the plots' compatibility with regular usage :-)
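Since ax is always accepted, one possible workaround until more keywords are supported is to style the Matplotlib Axes directly after the plot has been drawn. The sketch below assumes the object passed as ax is a regular Matplotlib Axes. The helper style_axes is hypothetical and not part of HandySpark, and the 'Fare' column in the usage comment is only an illustrative assumption. Note that set_xlim only narrows the visible range; it does not change the binning the way the pandas range argument does.

```python
def style_axes(ax, grid=True, xlim=None):
    """Apply options that hist() does not accept to an existing Matplotlib Axes.

    `style_axes` is a hypothetical helper, not part of HandySpark.
    """
    ax.grid(grid)  # same visual effect as pandas' grid=True
    if xlim is not None:
        # Only restricts the visible x-range; unlike pandas' `range`,
        # it does not change how the bins are computed.
        ax.set_xlim(*xlim)
    return ax


# Usage (assumes `hdf` is a HandyFrame with a numeric 'Fare' column
# and `axs` comes from plt.subplots):
#   hdf.cols['Fare'].hist(bins=20, ax=axs[0])
#   style_axes(axs[0], grid=True, xlim=(0, 100))
```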

Another thing: the keyword arguments are currently being passed to plt.subplots when creating stratified plots.
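On the original question of discovering which keyword arguments a plot method accepts: a generic Python approach is to inspect the method's signature. This is a minimal sketch using the standard inspect module. The hist function below is only a stand-in whose signature mirrors what is described in this thread (the default bins=10 is an assumption), so check the result against the HandySpark version you have installed.

```python
import inspect


def accepted_keywords(func):
    """Return the named parameters of `func`, excluding `self`."""
    params = inspect.signature(func).parameters
    return [name for name in params if name != "self"]


# Stand-in with a signature like the one described in this thread
# (an assumption; check the installed HandySpark version):
def hist(self, bins=10, ax=None):
    pass


print(accepted_keywords(hist))  # ['bins', 'ax']
```

With a real HandyFrame, the same helper could be called as accepted_keywords(hdf.cols['Embarked'].hist). If a method's signature includes a catch-all such as **kwargs, it shows up by name in the list, which only tells you that extra keywords are forwarded somewhere, not which ones are honored.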

sophia-wright-blue commented 5 years ago

thanks for that information @dvgodoy , looking forward to more releases of this amazing library!