plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

[Feature Request] Replace dependency 'retrying' with 'tenacity' to allow possible Pyodide compatibility. #2907

Closed jmsmdy closed 3 years ago

jmsmdy commented 3 years ago

Pyodide is a Mozilla project that allows running Python code in the browser. It currently allows importing wheels from PyPI, and a number of packages such as numpy, pandas, and matplotlib work in Pyodide. Plotly does not seem to work, at least partly because of its dependency on 'retrying', which hasn't been updated since 2014 and doesn't have a wheel on PyPI (manually generating a wheel doesn't seem to work either).

I'd like to request that all usage of 'retrying' in Plotly be replaced with the maintained fork 'tenacity'.

nicolaskruchten commented 3 years ago

We'd probably accept a PR for this if you want to give it a shot!

Alex-Monahan commented 3 years ago

Wow, we just talked about running plotly.py in Pyodide today. We would be super interested in using this capability as well! We are trying to get plotly-express functionality in the browser.

jmsmdy commented 3 years ago

So, I tried building a wheel for plotly with the few references to retrying replaced by tenacity, but I'm having trouble loading the wheel into Pyodide. You can check out the wheel yourself here: https://test.pypi.org/project/plotly-jmsmdy/4.12.1/

When I tried to load it into Pyodide via micropip, I got the strange error "File is not a zip file". Looking into the JS console logs, I found the following error: Access to XMLHttpRequest at 'https://test-files.pythonhosted.org/packages/81/65/17b03a5b25a0496a6ef27cc3ecc15a8137f9e3a990518ee78b65aa8835c1/plotly_jmsmdy-4.12.1-py2.py3-none-any.whl' from origin 'https://alpha.iodide.app' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

From what I've read, this is a cross-origin check. So the wheel isn't loading because it's hosted on test.pypi rather than on PyPI. I have no trouble installing the wheel locally, so it seems to be an issue with the cross-origin policy in Pyodide. It should work if the changes are merged into the next release on PyPI.

I've created a pull request here: https://github.com/plotly/plotly.py/pull/2911

jmsmdy commented 3 years ago

OK, I've worked around the cross-origin issue, so you can test plotly on Pyodide for yourself! Here's the code to import:

import micropip

micropip.install('https://cors-anywhere.herokuapp.com/' + 'https://test-files.pythonhosted.org/packages/81/65/17b03a5b25a0496a6ef27cc3ecc15a8137f9e3a990518ee78b65aa8835c1/plotly_jmsmdy-4.12.1-py2.py3-none-any.whl')

To get plotly plots to display, you should save to a file in the "virtual file system", then read in the html file to a string, then set the srcdoc of an iframe to that string. E.g., create a markdown cell like:

%% md
#### Here's a DOM element to manipulate:
<iframe id="targetElement" srcdoc='Change Me!'></iframe>

Then later load the plot into the string plot_html, and run a cell like the following:

%% py
from js import document
elt = document.getElementById("targetElement")
elt.srcdoc = plot_html

I'm sure there is a cleaner way to do this, but this should work for now just to get some plots displaying.
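The three steps described above (write the figure to the virtual file system, read the file back into a string, assign that string to the iframe's srcdoc) could be sketched roughly as below. This is a minimal sketch, not a tested Pyodide recipe: the helper name figure_to_html_string and the file name plot.html are made up for illustration, and the only plotly call it relies on is fig.write_html. The Pyodide-specific part is left as comments because the js module can only be imported inside Pyodide.

```python
def figure_to_html_string(fig, path="plot.html"):
    """Write `fig` to `path`, then return the file's contents as a string.

    `fig` is assumed to be a plotly figure; anything that has a
    `write_html(path)` method will work. Inside Pyodide, `path` points
    into the in-memory virtual file system.
    """
    fig.write_html(path)
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical usage inside a Pyodide cell (`js` only exists there):
#
#   import plotly.graph_objects as go
#   from js import document
#
#   fig = go.Figure(go.Scatter(x=[1, 2, 3], y=[1, 4, 9]))
#   plot_html = figure_to_html_string(fig)
#   document.getElementById("targetElement").srcdoc = plot_html
```

The resulting plot_html string is what the `%% py` cell above assigns to elt.srcdoc.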

Alex-Monahan commented 3 years ago

Thank you for posting your wheel! We were able to get plotly express in the browser working with Pyodide! We ended up using a shared worker, so we only have to load the Python packages once per browser session, so that speeds things up for our use case (we tend to have many tabs open at once, all pointed to our visualization service).

nicolaskruchten commented 3 years ago

Wow 🤩 that's really neat! Is there a web-accessible demo somewhere I can poke at?

nicolaskruchten commented 3 years ago

I'm sorry it took so long to sort this out, but v5.0 came out this morning, and it includes your fix replacing retrying with tenacity ... I'd love to get a demo of Plotly working in Pyodide running: how can I get started with it? :)

jmsmdy commented 3 years ago

Hi @nicolaskruchten, here's a codepen which is running plotly==5.0.0 loaded from pypi, running on the latest dev version of pyodide (0.18.0dev0): https://codepen.io/jmsmdy/pen/MWpdjVZ

This example has an editable text area where you can make any plotly chart you want (just have the final result be assigned to a variable called 'fig'). It takes a while to load pyodide and plotly, but after that you can change the python code and near-instantly see the result. Try changing lambda x: x**3 to lambda x: x**4, for example, and then click "Generate Plot".

Notes:

Alex-Monahan commented 3 years ago

We use a similar approach with just a couple of differences. We just use Plotly Express to generate the config json information and pass that into Plotly.js to then render the chart (rather than building the full html in plotly express). I'm not sure if that is any faster though - we have not benchmarked it!

We also put Pyodide in a shared worker so that multiple tabs can share the same Python interpreter. Unfortunately, this can't be cached across multiple refreshes (as far as I can tell!)

The last main difference is that we self host pyodide and the plotly wheel since we are behind a firewall in a corporate setting.

Our biggest pain point is really the initialization time. We are seeing ~40 seconds to get the chart rendered the first time. If that is something that could be sped up in some way, we would definitely benefit! Similarly, if any kind of caching could be implemented, that would be a tremendous help.

jmsmdy commented 3 years ago

A ServiceWorker is the way to go (although it requires a lot of care to set up correctly to make sure cache invalidation works, otherwise end users may be stuck with an outdated version): https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API

A ServiceWorker is basically a simple proxy/cache server written in JS that runs locally in the end user's browser and persists across browser sessions. It is designed to enable PWAs (Progressive Web Apps), which are apps written in HTML+CSS+JS that can run entirely offline but access more content/updates online as they become available. I think the default storage limit is 512 MB, which is more than enough for this purpose. You can set it up so that you have one ServiceWorker for all pages on a given domain, so you cache pyodide and all the libraries once, and it will "serve" those files to all pages on that domain.

jmsmdy commented 3 years ago

@Alex-Monahan By the way, for creating a new plot, using fig.to_html(include_plotlyjs=False) versus using fig.to_json() and passing it to plotly.js -- both should be comparable in speed (at least with the default json encoder). There is a very small overhead to using to_html(include_plotlyjs=False), because the html is basically just the json wrapped by a small amount of html.

However, your approach is still valid, because it makes it easier to use Plotly.update() and other react methods in JavaScript (which are faster / more efficient in updating an existing plot) if you need to use them for performance reasons (e.g., for doing dynamic animations).
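To make the "JSON wrapped by a small amount of HTML" point concrete, here is a rough stdlib-only sketch of such a wrapper. It is not plotly's actual to_html template: the function name wrap_figure_json, the div id "plot", and the exact markup are assumptions for illustration, and it presumes plotly.js is already loaded on the page (the include_plotlyjs=False situation). The sample dict stands in for the output of fig.to_json().

```python
import json


def wrap_figure_json(fig_json, div_id="plot"):
    """Wrap a figure's JSON string in a minimal HTML snippet.

    Illustration only: no escaping is done for strings that contain
    "</script>", which a real template would have to handle.
    """
    spec = json.loads(fig_json)  # parse so data and layout can be passed separately
    data = json.dumps(spec.get("data", []))
    layout = json.dumps(spec.get("layout", {}))
    return (
        f'<div id="{div_id}"></div>\n'
        f"<script>Plotly.newPlot({json.dumps(div_id)}, {data}, {layout});</script>"
    )


# Stand-in for fig.to_json(); a real figure's JSON has the same top-level keys.
fig_json = json.dumps(
    {
        "data": [{"type": "scatter", "x": [1, 2], "y": [1, 8]}],
        "layout": {"title": {"text": "demo"}},
    }
)
html = wrap_figure_json(fig_json)
```

Almost all of the output is the serialized data and layout, so the extra cost over producing the JSON alone is a few short strings. Keeping the JSON separate, as in the other approach, leaves the data and layout available in JavaScript for later in-place updates.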

Alex-Monahan commented 3 years ago

Thank you! I will take a closer look at service workers - caching would be a huge help for us.

nicolaskruchten commented 3 years ago

Thanks for sharing @jmsmdy! This is pretty mind-bending stuff ;)

Am I correct in understanding that this demo wouldn't have worked pre-5.0?

jmsmdy commented 3 years ago

@nicolaskruchten Actually, due to the delay, a modified version of plotly was added to the built-in pyodide packages, so this would have worked by removing the ==5.0.0 version spec. The built-in packages are maintained by the pyodide project and served via a pyodide cdn which shadows pypi. Examples include numpy and pandas, which need to be patched because they use C extensions which need to be changed to work on WASM.

The maintainers of pyodide are trying to get as many of their patches pushed upstream as possible. Right now, the versions of numpy and pandas available on pyodide are going to be perpetually out of date, and each update requires work by the maintainers (sometimes a significant amount).

The fact that the latest version of plotly can now be pulled straight from pypi means that pyodide no longer needs to maintain a patched version. I should submit a pull request to pyodide to remove the custom plotly version from the pyodide repo so that micropip.install('plotly') on pyodide will pull the latest version from pypi even if no version is specified.

nicolaskruchten commented 3 years ago

Ah, thanks for the explanation! Yes, I think it would be great to be able to help the pyodide folks un-special-case plotly ... if you do open a PR, could you please tag me in it or link to it from here? I'm curious to engage with this community :)

Thanks for pushing through on this and apologies again for the delay!