AutoViML / AutoViz

Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.
Apache License 2.0

Installation instructions and sample code not working #4

Closed Tronic closed 4 years ago

Tronic commented 5 years ago

The sample code is missing the required from autoviz import ... line. Preferably, give a sample that can be copied, pasted, and run as-is, and provide pictures of the output, so that one can evaluate whether to install this instead of the many other plotting libraries.

The dependencies are extremely heavy. Is it absolutely necessary to install Jupyter? Something inside also depends on sklearn, which was not included in pip deps.

As for CSV reading: if you are not able to autodetect/guess separators and date formats, do not bother "including" it in your library. It takes just two lines of code to load the data with pandas and then pass it to another library for plotting, and in most cases one needs to do something in between anyway (data preprocessing).
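To illustrate the separator-guessing point, here is a minimal sketch that uses only the Python standard library. The helper name guess_separator, the candidate list, and the sample data are assumptions made up for this example; they are not part of AutoViz. pandas offers similar detection through read_csv(..., sep=None, engine="python"), which relies on the same csv.Sniffer tool.

```python
import csv
import io


def guess_separator(text, candidates=",;\t|"):
    """Guess the column separator from a sample of CSV text.

    Hypothetical helper for illustration only. csv.Sniffer raises
    csv.Error if it cannot settle on one of the candidates.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    return dialect.delimiter


# Made-up sample data using a semicolon as the separator.
sample = (
    "date;city;temp\n"
    "2019-08-09;Helsinki;21.5\n"
    "2019-08-10;Espoo;19.0\n"
)

sep = guess_separator(sample)
rows = list(csv.reader(io.StringIO(sample), delimiter=sep))
```

Sniffing is a heuristic and can fail on very short or irregular samples, so a caller would likely still want a way to pass the separator explicitly.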

An ideal plotting library would have an API like this:

from fictionalplot import Figure  # if possible, keep it to just one simple import

fig = Figure()   # Internally holds graphics context, Qt window, websocket to browser or whatever
fig.plot(df)  # display the graph and return instantly, try to auto-guess suitable format based on df

If using a Qt window, spawn a new process that does not terminate when the Python program ends, and that is automatically shared by all figures of all running programs (don't block execution of the program like Matplotlib does). If using Notebook/browser, you don't need separate process because browser already does that.

For true interactive plots (e.g. receive user input on scaling changes to recalculate new data in Python), use async/await to avoid blocking Python from executing while waiting for user input (but stay away from import asyncio which is utter crap -- instead use trio if you must).

Good luck with your plotting library. We could certainly use some good options (I am not entirely happy with either Matplotlib or Plotly, and everything else is just bad).

AutoViML commented 5 years ago

Hi Karkkainen: Thanks for your excellent comments. I will go over your suggestions carefully and get back to you. Would you mind trying it after I make some changes? Thanks again, Ram

Tronic commented 5 years ago

I am currently looking at Bokeh which seems quite promising. The server mode is able to stream realtime updates to browser, and receive widget changes back to application. The Bokeh API, especially in server mode, is still too complex (requiring callback functions), but I believe it could be fixed with reasonable effort. Bokeh's complexity is offset by its beautiful Canvas and WebGL based plots. Therefore, right now I cannot evaluate your library further; I'll check back if there turn out to be some fatal shortcomings with Bokeh too.

AutoViML commented 5 years ago

Hi Karkkainen: Sorry to hear about your experience. I have fixed the issues you had importing. It is as simple as 1-2-3:

  1. First, install autoviz:
    pip install autoviz

  2. Next, initialize the Figure class, which I called AutoViz_Class, so it can store the images generated from your dataframe (df):
    from autoviz.AutoViz_Class import AutoViz_Class
    AV = AutoViz_Class()

  3. Run the AutoViz method of that class, passing in your data frame (df):
    dft = AV.AutoViz('', sep, target, df)

I hope this is as simple as you wanted. If adding "sep" is a problem for you, I will investigate how "sep" can be removed. Thanks for your suggestions.

You can check the demo notebook, which I uploaded yesterday; it works correctly.

Ram