has2k1 / plotnine

A Grammar of Graphics for Python
https://plotnine.org
MIT License
4.02k stars, 217 forks

Offer of plotnine_ex.py #82

Closed jrbrearley closed 6 years ago

jrbrearley commented 6 years ago

Hi Hassan: I am wondering if you would consider adding the following file, plotnine_ex.py, to your repository? Having working sample files that you can download and run immediately really helps new users.

plotnine_ex.zip

has2k1 commented 6 years ago

The idea of a command tutorial is nice, but it would have to be better than what can be demonstrated in a Jupyter notebook. The documentation website has a place for tutorials. The files themselves are part of the plotnine-examples repository.

Also, from the script I see that you wrote a data loading function. The right way to do it is:

from plotnine.data import mtcars, meat

That also takes care of any categorical and date columns.

I think the way forward is to turn the script into an introductory Jupyter notebook tutorial, and also include a link to the tutorials section in the README.rst on GitHub.

jrbrearley commented 6 years ago

Hi Hassan: Thanks for the link to the docs website. WBN if you added a ../docs directory with a weblink to this website so other people can easily find it.

I played around more with your jupyter files and the others in other packages. I will grant you that with enough copy / pasting from the web page into the PYTHON IDLE the 'elements' example eventually ran without error. I continue to prefer an actual script that works on the first try.

Also, keeping the examples as a separate download package plotnine_examples obscures them from potential users. WBN if they were in the above proposed ../docs directory as part of the main plotnine package.

BTW, gnuplot current release includes 148 working demo scripts in demo directory, each of which does multiple demo graph plots. Also has 6 huge PDFs in docs directory.

The WinPython package from sourceforge.net, which now includes plotnine as part of the 400 MB .exe install file, does NOT like your way of loading the data/*.csv files, see:

C:\WinPython\scripts>python
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from plotnine.data import mtcars
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'mtcars'
>>> from plotnine.data import meat
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'meat'

I get this error from both the command line window and the IDLE GUI window.

The older style command line python does load the file, but with warnings:

C:\JOHN_MIS>python
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 17:26:49) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from plotnine.data import mtcars
c:\python\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools

If there is an easy way to convert my script to .ipynb format, feel free. I would encourage you to also make the .py executable script available as well.

has2k1 commented 6 years ago

Jupyter notebooks are the convenient format for most users. They are in a separate repository so as not to balloon up the size of the main repository.

... does NOT like your way of loading the data/*.csv files

That's not good. I do not use Windows so I cannot debug the issue. Either way, WinPython is doing something wrong.

I'll create an introductory tutorial for the website, and I'll extract some examples from your script, the lines example for sure.

jrbrearley commented 6 years ago

Hi Hassan: Can you give me a hint as to which routines should have run for the import mtcars? Maybe I can get you more info. I can tell you right now that the __init__.py in the data directory does NOT run when the plotnine package loads.

has2k1 commented 6 years ago

plotnine.data is a package so the plotnine/data/__init__.py only runs when you import from plotnine.data.
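The point about when a sub-package's `__init__.py` runs can be checked with a throwaway package. This is a minimal sketch using only the standard library; `demo_pkg` and its contents are hypothetical stand-ins built in a temporary directory, not plotnine's actual layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/ with a demo_pkg/data/ sub-package under root."""
    data_dir = root / "demo_pkg" / "data"
    data_dir.mkdir(parents=True)
    # The parent package keeps a list of which __init__.py files have run.
    (root / "demo_pkg" / "__init__.py").write_text("LOADED = ['demo_pkg']\n")
    # The sub-package records its own execution in the parent's list.
    (data_dir / "__init__.py").write_text(
        "import demo_pkg\n"
        "demo_pkg.LOADED.append('demo_pkg.data')\n"
        "mtcars = 'stand-in for a DataFrame'\n"
    )


def run_demo() -> dict:
    """Import the parent first, then the sub-package, and report what ran."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("demo_pkg")
            before = list(pkg.LOADED)  # only the parent has run so far
            data = importlib.import_module("demo_pkg.data")
            after = list(pkg.LOADED)   # now the sub-package has run too
            return {"before": before, "after": after, "mtcars": data.mtcars}
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in ("demo_pkg.data", "demo_pkg"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(run_demo())
```

Importing the parent alone leaves the sub-package untouched; its `__init__.py` only executes once something imports `demo_pkg.data` explicitly.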

The way from plotnine.data import mtcars is failing is not informative enough. Just for the usual poking around, what are the results of

# 1.
import plotnine
dir(plotnine)

# 2.
import plotnine.data as pdata
dir(pdata)

And what is the file structure of .../python/site-packages/plotnine/? Does the plotnine/data sub-package have the __init__.py file? That is all I can think of at the moment.
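The file-structure question can also be answered from inside Python. The following is a rough sketch, assuming the package lives on disk as ordinary files; `inspect_package` is a hypothetical helper name. It prints each directory entry with `repr()` so that stray whitespace or odd characters in a filename become visible.

```python
import importlib.util
import os


def inspect_package(name: str) -> dict:
    """Report where a package lives and what files its directory holds."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(name)
    # Packages expose one or more directories; plain modules expose none.
    locations = list(spec.submodule_search_locations or [])
    entries = {loc: sorted(os.listdir(loc)) for loc in locations}
    return {
        "origin": spec.origin,
        "has_init": any("__init__.py" in names for names in entries.values()),
        "entries": entries,
    }


if __name__ == "__main__":
    # "json" is used here only because it ships with Python;
    # substitute "plotnine.data" on a machine where plotnine is installed.
    report = inspect_package("json")
    print("origin:", report["origin"])
    print("has __init__.py:", report["has_init"])
    for location, names in report["entries"].items():
        print(location)
        for entry in names:
            print("   ", repr(entry))
```

A healthy regular package reports an `origin` ending in `__init__.py` and `has_init` as true; anything else is worth a closer look at the listed names.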

jrbrearley commented 6 years ago

Hi Hassan: I added print("hello from plotnine.data") to the __init__.py files in my 2 python installations. This debug statement, along with your suggestions, led me to the real problem. In the WinPython plotnine/data directory, the filename "__init__ .py" has a space (highlighted in red) before the .py, so presumably python doesn't find the file. When I rename the file to remove the space, things work much better.

In the above text, this webpage keeps converting double underscores to bold!

So the fun question is: How did the space get into the filename? Does the github repository have separate files for Windows release?

When I get a chance, I will try installing WinPython 3.6.3 from sourceforge.net on another Win7 PC and see what happens.
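A likely explanation for why the import itself succeeded while the names were missing: on Python 3.3 and later, a directory without a valid `__init__.py` can still be imported as a namespace package, which carries none of the names the real `__init__.py` would define. The sketch below reproduces that under stated assumptions; `broken_pkg` is a hypothetical throwaway package and `mtcars` is just a placeholder string.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_with_init_name(init_name: str) -> dict:
    """Build broken_pkg/<init_name> in a temp dir, import it, and report."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "broken_pkg"
        pkg_dir.mkdir()
        (pkg_dir / init_name).write_text("mtcars = 'stand-in'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("broken_pkg")
            return {
                "has_mtcars": hasattr(module, "mtcars"),
                # Namespace packages have no file behind them.
                "file": getattr(module, "__file__", None),
            }
        finally:
            # Undo the path and module cache changes.
            sys.path.remove(tmp)
            sys.modules.pop("broken_pkg", None)


if __name__ == "__main__":
    print("correct name :", import_with_init_name("__init__.py"))
    print("stray space  :", import_with_init_name("__init__ .py"))
```

With the misnamed file the import does not raise, but the module has no `mtcars` attribute and no file behind it, which matches the short `dir(pdata)` listing and the `ImportError: cannot import name` seen in this thread.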


My full GUI WinPython install from sourceforge.net gives the following.

C:\WinPython\scripts>python
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import plotnine

>>> dir(plotnine)
['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__',
'__name__', '__package__', '__path__', '__spec__', '__version__', '_get_all_imports',
'absolute_import', 'aes', 'annotate', 'arrow', 'as_labeller', 'coord_cartesian',
'coord_equal', 'coord_fixed', 'coord_flip', 'coord_trans', 'coords', 'doctools',
'element_blank', 'element_line', 'element_rect', 'element_text', 'exceptions',
'expand_limits', 'facet_grid', 'facet_null', 'facet_wrap', 'facets', 'geom_abline',
'geom_area', 'geom_bar', 'geom_bin2d', 'geom_blank', 'geom_boxplot', 'geom_col',
'geom_count', 'geom_crossbar', 'geom_density', 'geom_dotplot', 'geom_errorbar',
'geom_errorbarh', 'geom_freqpoly', 'geom_histogram', 'geom_hline', 'geom_jitter',
'geom_label', 'geom_line', 'geom_linerange', 'geom_path', 'geom_point',
'geom_pointrange', 'geom_polygon', 'geom_qq', 'geom_quantile', 'geom_rect',
'geom_ribbon', 'geom_rug', 'geom_segment', 'geom_smooth', 'geom_spoke', 'geom_step',
'geom_text', 'geom_tile', 'geom_violin', 'geom_vline', 'geoms', 'ggplot', 'ggsave',
'ggtitle', 'guide_colorbar', 'guide_colourbar', 'guide_legend', 'guides',
'label_both', 'label_context', 'label_value', 'labeller', 'labels', 'labs', 'layer',
'lims', 'options', 'position_dodge', 'position_fill', 'position_identity',
'position_jitter', 'position_jitterdodge', 'position_nudge', 'position_stack',
'positions', 'qplot', 'scale_alpha', 'scale_alpha_continuous', 'scale_alpha_datetime',
'scale_alpha_discrete', 'scale_alpha_identity', 'scale_alpha_manual',
'scale_color_brewer', 'scale_color_cmap', 'scale_color_continuous',
'scale_color_datetime', 'scale_color_desaturate', 'scale_color_discrete',
'scale_color_distiller', 'scale_color_gradient', 'scale_color_gradient2',
'scale_color_gradientn', 'scale_color_gray', 'scale_color_grey', 'scale_color_hue',
'scale_color_identity', 'scale_color_manual', 'scale_colour_brewer',
'scale_colour_cmap', 'scale_colour_continuous', 'scale_colour_datetime',
'scale_colour_desaturate', 'scale_colour_discrete', 'scale_colour_distiller',
'scale_colour_gradient', 'scale_colour_gradient2', 'scale_colour_gradientn',
'scale_colour_gray', 'scale_colour_grey', 'scale_colour_hue', 'scale_colour_identity',
'scale_colour_manual', 'scale_fill_brewer', 'scale_fill_cmap', 'scale_fill_continuous',
'scale_fill_datetime', 'scale_fill_desaturate', 'scale_fill_discrete',
'scale_fill_distiller', 'scale_fill_gradient', 'scale_fill_gradient2',
'scale_fill_gradientn', 'scale_fill_gray', 'scale_fill_grey', 'scale_fill_hue',
'scale_fill_identity', 'scale_fill_manual', 'scale_linetype',
'scale_linetype_continuous', 'scale_linetype_discrete', 'scale_linetype_identity',
'scale_linetype_manual', 'scale_shape', 'scale_shape_continuous',
'scale_shape_discrete', 'scale_shape_identity', 'scale_shape_manual', 'scale_size',
'scale_size_area', 'scale_size_continuous', 'scale_size_datetime',
'scale_size_discrete', 'scale_size_identity', 'scale_size_manual', 'scale_size_radius',
'scale_stroke', 'scale_stroke_continuous', 'scale_stroke_discrete',
'scale_x_continuous', 'scale_x_date', 'scale_x_datetime', 'scale_x_discrete',
'scale_x_log10', 'scale_x_reverse', 'scale_x_sqrt', 'scale_x_timedelta',
'scale_y_continuous', 'scale_y_date', 'scale_y_datetime', 'scale_y_discrete',
'scale_y_log10', 'scale_y_reverse', 'scale_y_sqrt', 'scale_y_timedelta', 'scales',
'stat_bin', 'stat_bin2d', 'stat_bin_2d', 'stat_bindot', 'stat_boxplot', 'stat_count',
'stat_density', 'stat_ecdf', 'stat_function', 'stat_identity', 'stat_qq',
'stat_quantile', 'stat_smooth', 'stat_sum', 'stat_summary', 'stat_summary_bin',
'stat_unique', 'stat_ydensity', 'stats', 'theme', 'theme_538', 'theme_bw',
'theme_classic', 'theme_dark', 'theme_get', 'theme_gray', 'theme_grey', 'theme_light',
'theme_linedraw', 'theme_matplotlib', 'theme_minimal', 'theme_seaborn', 'theme_set',
'theme_update', 'theme_void', 'theme_xkcd', 'themes', 'utils', 'watermark', 'xlab',
'xlim', 'ylab', 'ylim']

>>> import plotnine.data as pdata
NB: MISSING: hello from plotnine.data message
>>> dir(pdata)
['__doc__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
NB: MISSING many items here, see tests below for non-GUI Python
>>> pdata.mtcars.head()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'plotnine.data' has no attribute 'mtcars'

C:\WinPython\scripts>cd ..\python-3.6.3.amd64\lib\site-packages\plotnine\data

C:\WinPython\python-3.6.3.amd64\Lib\site-packages\plotnine\data>dir
 Volume in drive C is JRB3_C
 Volume Serial Number is 8A39-4370

 Directory of C:\WinPython\python-3.6.3.amd64\Lib\site-packages\plotnine\data

2017-11-22  01:15 PM    <DIR>          .
2017-11-22  01:15 PM    <DIR>          ..
2017-10-31  05:08 PM         2,772,143 diamonds.csv
2017-10-31  05:08 PM            21,860 economics.csv
2017-10-31  05:08 PM           123,339 economics_long.csv
2017-10-31  05:08 PM           298,297 faithfuld.csv
2017-10-31  05:08 PM             1,678 huron.csv
2017-10-31  05:08 PM            39,004 luv_colours.csv
2017-10-31  05:08 PM            62,608 meat.csv
2017-10-31  05:08 PM            98,022 midwest.csv
2017-10-31  05:08 PM            16,046 mpg.csv
2017-10-31  05:08 PM             6,773 msleep.csv
2017-10-31  05:08 PM             1,787 mtcars.csv
2017-10-31  05:08 PM            13,559 pageviews.csv
2017-10-31  05:08 PM               511 presidential.csv
2017-10-31  05:08 PM            57,035 seals.csv
2017-10-31  05:08 PM           523,993 txhousing.csv
2017-10-31  05:08 PM            13,793 __init__ - Copy.py
2017-11-22  01:44 PM            14,300 __init__ .py
2017-11-09  10:53 AM    <DIR>          __pycache__
              17 File(s)      4,064,748 bytes
               3 Dir(s)  388,678,316,032 bytes free


My non-GUI Python install from https://www.python.org/downloads/ hit issues doing 'pip install plotnine'. It wanted the Microsoft C++ build tools; the error message gave me a link to visualcppbuildtools_full_2015.exe. Once the build tools were available, 'pip install plotnine' ran OK. Not sure why there is an extra plotnine-0.3.0-py3.6.egg subdirectory in the file structure.

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

c:\>python
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 17:26:49) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import plotnine
c:\python\lib\site-packages\statsmodels\compat\pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
  from pandas.core import datetools

>>> dir(plotnine)
['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__',
'__name__', '__package__', '__path__', '__spec__', '__version__', '_get_all_imports',
'_version', 'absolute_import', 'aes', 'annotate', 'arrow', 'as_labeller',
'coord_cartesian', 'coord_equal', 'coord_fixed', 'coord_flip', 'coord_trans', 'coords',
'doctools', 'element_blank', 'element_line', 'element_rect', 'element_text',
'exceptions', 'expand_limits', 'facet_grid', 'facet_null', 'facet_wrap', 'facets',
'geom_abline', 'geom_area', 'geom_bar', 'geom_bin2d', 'geom_blank', 'geom_boxplot',
'geom_col', 'geom_count', 'geom_crossbar', 'geom_density', 'geom_dotplot',
'geom_errorbar', 'geom_errorbarh', 'geom_freqpoly', 'geom_histogram', 'geom_hline',
'geom_jitter', 'geom_label', 'geom_line', 'geom_linerange', 'geom_path', 'geom_point',
'geom_pointrange', 'geom_polygon', 'geom_qq', 'geom_quantile', 'geom_rect',
'geom_ribbon', 'geom_rug', 'geom_segment', 'geom_smooth', 'geom_spoke', 'geom_step',
'geom_text', 'geom_tile', 'geom_violin', 'geom_vline', 'geoms', 'ggplot', 'ggsave',
'ggtitle', 'guide_colorbar', 'guide_colourbar', 'guide_legend', 'guides',
'label_both', 'label_context', 'label_value', 'labeller', 'labels', 'labs', 'layer',
'lims', 'options', 'position_dodge', 'position_fill', 'position_identity',
'position_jitter', 'position_jitterdodge', 'position_nudge', 'position_stack',
'positions', 'qplot', 'scale_alpha', 'scale_alpha_continuous', 'scale_alpha_datetime',
'scale_alpha_discrete', 'scale_alpha_identity', 'scale_alpha_manual',
'scale_color_brewer', 'scale_color_cmap', 'scale_color_continuous',
'scale_color_datetime', 'scale_color_desaturate', 'scale_color_discrete',
'scale_color_distiller', 'scale_color_gradient', 'scale_color_gradient2',
'scale_color_gradientn', 'scale_color_gray', 'scale_color_grey', 'scale_color_hue',
'scale_color_identity', 'scale_color_manual', 'scale_colour_brewer',
'scale_colour_cmap', 'scale_colour_continuous', 'scale_colour_datetime',
'scale_colour_desaturate', 'scale_colour_discrete', 'scale_colour_distiller',
'scale_colour_gradient', 'scale_colour_gradient2', 'scale_colour_gradientn',
'scale_colour_gray', 'scale_colour_grey', 'scale_colour_hue', 'scale_colour_identity',
'scale_colour_manual', 'scale_fill_brewer', 'scale_fill_cmap', 'scale_fill_continuous',
'scale_fill_datetime', 'scale_fill_desaturate', 'scale_fill_discrete',
'scale_fill_distiller', 'scale_fill_gradient', 'scale_fill_gradient2',
'scale_fill_gradientn', 'scale_fill_gray', 'scale_fill_grey', 'scale_fill_hue',
'scale_fill_identity', 'scale_fill_manual', 'scale_linetype',
'scale_linetype_continuous', 'scale_linetype_discrete', 'scale_linetype_identity',
'scale_linetype_manual', 'scale_shape', 'scale_shape_continuous',
'scale_shape_discrete', 'scale_shape_identity', 'scale_shape_manual', 'scale_size',
'scale_size_area', 'scale_size_continuous', 'scale_size_datetime',
'scale_size_discrete', 'scale_size_identity', 'scale_size_manual', 'scale_size_radius',
'scale_stroke', 'scale_stroke_continuous', 'scale_stroke_discrete',
'scale_x_continuous', 'scale_x_date', 'scale_x_datetime', 'scale_x_discrete',
'scale_x_log10', 'scale_x_reverse', 'scale_x_sqrt', 'scale_x_timedelta',
'scale_y_continuous', 'scale_y_date', 'scale_y_datetime', 'scale_y_discrete',
'scale_y_log10', 'scale_y_reverse', 'scale_y_sqrt', 'scale_y_timedelta', 'scales',
'stat_bin', 'stat_bin2d', 'stat_bin_2d', 'stat_bindot', 'stat_boxplot', 'stat_count',
'stat_density', 'stat_ecdf', 'stat_function', 'stat_identity', 'stat_qq',
'stat_quantile', 'stat_smooth', 'stat_sum', 'stat_summary', 'stat_summary_bin',
'stat_unique', 'stat_ydensity', 'stats', 'theme', 'theme_538', 'theme_bw',
'theme_classic', 'theme_dark', 'theme_get', 'theme_gray', 'theme_grey', 'theme_light',
'theme_linedraw', 'theme_matplotlib', 'theme_minimal', 'theme_seaborn', 'theme_set',
'theme_update', 'theme_void', 'theme_xkcd', 'themes', 'utils', 'watermark', 'xlab',
'xlim', 'ylab', 'ylim']

>>> import plotnine.data as pdata
hello from plotnine.data
>>> dir(pdata)
['CategoricalDtype', '_ROOT', '__all__', '__builtins__', '__cached__', '__doc__',
'__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__',
'_ordered_categories', '_unordered_categories', 'absolute_import', 'categories',
'columns', 'diamonds', 'economics', 'economics_long', 'faithfuld', 'huron',
'luv_colours', 'meat', 'midwest', 'mpg', 'msleep', 'mtcars', 'os', 'pageviews', 'pd',
'presidential', 'seals', 'txhousing', 'unicode_literals']

>>> pdata.mtcars.head()
                name   mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  \
0          Mazda RX4  21.0    6  160.0  110  3.90  2.620  16.46   0   1     4
1      Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4
2         Datsun 710  22.8    4  108.0   93  3.85  2.320  18.61   1   1     4
3     Hornet 4 Drive  21.4    6  258.0  110  3.08  3.215  19.44   1   0     3
4  Hornet Sportabout  18.7    8  360.0  175  3.15  3.440  17.02   0   0     3

   carb
0     4
1     4
2     1
3     1
4     2

c:\>cd \Python\lib\site-packages\plotnine-0.3.0-py3.6.egg\plotnine\data

c:\Python\Lib\site-packages\plotnine-0.3.0-py3.6.egg\plotnine\data>dir
 Volume in drive C is JRB3_C
 Volume Serial Number is 8A39-4370

 Directory of c:\Python\Lib\site-packages\plotnine-0.3.0-py3.6.egg\plotnine\data

2017-11-22  01:08 PM    <DIR>          .
2017-11-22  01:08 PM    <DIR>          ..
2017-11-17  10:03 AM         2,772,143 diamonds.csv
2017-11-17  10:03 AM            21,860 economics.csv
2017-11-17  10:03 AM           123,339 economics_long.csv
2017-11-17  10:03 AM           298,297 faithfuld.csv
2017-11-17  10:03 AM             1,678 huron.csv
2017-11-17  10:03 AM            39,004 luv_colours.csv
2017-11-17  10:03 AM            62,608 meat.csv
2017-11-17  10:03 AM            98,022 midwest.csv
2017-11-17  10:03 AM            16,046 mpg.csv
2017-11-17  10:03 AM             6,773 msleep.csv
2017-11-17  10:03 AM             1,787 mtcars.csv
2017-11-17  10:03 AM            13,559 pageviews.csv
2017-11-17  10:03 AM               511 presidential.csv
2017-11-17  10:03 AM            57,035 seals.csv
2017-11-17  10:03 AM           523,993 txhousing.csv
2017-11-17  10:03 AM            13,793 __init__ - Copy.py
2017-11-22  01:23 PM            14,299 __init__.py
2017-11-22  01:24 PM    <DIR>          __pycache__
              17 File(s)      4,064,747 bytes
               3 Dir(s)  388,677,705,728 bytes free

c:\Python\Lib\site-packages\plotnine-0.3.0-py3.6.egg\plotnine\data>

has2k1 commented 6 years ago

Great.

Does the github repository have separate files for Windows release?

No. WinPython gets the same files directly from PyPI. If it is not repeatable you can chalk it up to any of malware, antivirus, a failing hard disk or cosmic rays.

jrbrearley commented 6 years ago

Hi Hassan: Cosmic rays strike again! Second full install on my laptop ran fine, no extra spaces in the __init__.py file. Thanks for the suggestions on debugging.

has2k1 commented 6 years ago

There is a link to an introductory tutorial at kaggle.