scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

Try to get pyhf to run in Pyolite / JupyterLite #1775

Closed matthewfeickert closed 2 years ago

matthewfeickert commented 2 years ago

Summary

As pyodide is now running with scipy v1.7.3 (demoed on the NumPy docs), all the components needed to use pyhf in WebAssembly should be there. It would be interesting to see how much work is needed to get a minimal working example going and then to run some of the examples in the docs with JupyterLite.

cc @kratsg as he was thinking the same thing as me when he saw the NumPy tweet.


matthewfeickert commented 2 years ago

@henryiii points out that you "can [use] pure python packages with micropip" as the docs mention

Note that in addition to this list, pure Python packages with wheels can be loaded directly from PyPI with micropip.install.

henryiii commented 2 years ago
import micropip
await micropip.install("pyhf")
import pyhf

Screen Shot 2022-02-11 at 12 11 20 PM

henryiii commented 2 years ago

Anything with plots that doesn't use data? scrapbook is not supported, due to pyzmq.

matthewfeickert commented 2 years ago

This is awesome!

import micropip
await micropip.install("pyhf")  # pyodide has scipy and matplotlib already

import pyhf
from pyhf.contrib.viz import brazil
import numpy as np
import matplotlib.pyplot as plt

model = pyhf.simplemodels.uncorrelated_background(
    signal=[5.0, 10.0], bkg=[50.0, 60.0], bkg_uncertainty=[5.0, 12.0]
)
observations = [53.0, 65.0] + model.config.auxdata

poi_values = np.linspace(0.1, 5, 50)
obs_limit, exp_limits, (scan, results) = pyhf.infer.intervals.upperlimit(
    observations, model, poi_values, level=0.05, return_results=True
)
print(f"Upper limit (obs): μ = {obs_limit:.4f}")
print(f"Upper limit (exp): μ = {exp_limits[2]:.4f}")

fig, ax = plt.subplots()
fig.set_size_inches(10.5, 7)
ax.set_title("Hypothesis Tests")

artists = brazil.plot_results(poi_values, results, ax=ax)
plt.show()

pyhf_webassm

(Running on https://numpy.org/)

matthewfeickert commented 2 years ago

As @lukasheinrich points out:

This is amazing for HEPData integration. Imagine a "prefit/postfit" stack plot visualization off the likelihood displayed right on HEPData.

kratsg commented 2 years ago

We can stick this in an iframe perhaps: https://jupyterlite.readthedocs.io/en/latest/_static/repl/index.html?toolbar=1&kernel=python&code=import%20micropip%20;%20await%20micropip.install(%22pyhf%22)%20;%20import%20pyhf

kratsg commented 2 years ago

Even better -- the example in the issue encoded in the URL and an example with altair working

matthewfeickert commented 2 years ago

Thanks to @kratsg's tip on urllib.parse, here's a bit of code to generate a URL that enables the replite toolbar and also populates and runs the first cell with the micropip install and imports:

# pyolite_src.py
import urllib.parse

code = """\
import micropip
await micropip.install(["pyhf", "requests"])
import pyhf\
"""

parsed_url = urllib.parse.quote(code)
# url_base = "https://replite.vercel.app/retro/consoles/?toolbar=1&kernel=python&code="  # old URL now 404s
url_base = "https://jupyterlite.readthedocs.io/en/latest/_static/repl/index.html?toolbar=1&kernel=python&code="

print(f"# replite URL:\n{url_base + parsed_url}")
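
To check what one of these links will actually run before sharing it, the encoding can be reversed. This is only a sketch: `code_from_repl_url` is a made-up helper name, and it assumes the code lives in a single `code` query parameter, as in the URLs in this thread.

```python
import urllib.parse


def code_from_repl_url(url):
    """Return the decoded contents of the `code` query parameter."""
    query = urllib.parse.urlsplit(url).query
    # parse_qs percent-decodes values and returns a list per parameter
    params = urllib.parse.parse_qs(query)
    return params["code"][0]


url_base = "https://jupyterlite.readthedocs.io/en/latest/_static/repl/index.html?toolbar=1&kernel=python&code="
code = 'import micropip\nawait micropip.install(["pyhf", "requests"])\nimport pyhf'
url = url_base + urllib.parse.quote(code)

# Round trip: the decoded text should match what was encoded
print(code_from_repl_url(url))
```

Note that older Python versions also treated `;` as a query separator in `parse_qs`, so URLs that join statements with `;` may not round-trip cleanly there.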

lite-badge

(I grabbed the badge from the https://github.com/jupyterlite/jupyterlite README)

matthewfeickert commented 2 years ago

I'm trying to test things with installs from TestPyPI, where we might be able to get HEPData to work before v0.7.0 is out, but I'm hitting issues with installs, so I opened this Issue: https://github.com/pyodide/pyodide/issues/2166

matthewfeickert commented 2 years ago

TestPyPI won't work because (as noted in https://github.com/pyodide/pyodide/issues/2166#issuecomment-1036860386)

This is a problem with headers. The server must set the header access-control-allow-origin: * on the response. files.pythonhosted.org sets this header but test-files.pythonhosted.org does not.
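
For anyone wanting to check another package index the same way, here is a rough sketch of the header test described in that quote. `allows_cross_origin` is a hypothetical helper, not part of pyodide or micropip, and it only looks at the one header mentioned above rather than the full CORS rules.

```python
def allows_cross_origin(headers, origin=None):
    """Check response headers for a permissive access-control-allow-origin.

    `headers` is any mapping of header name to value. Header names are
    case-insensitive, so they are lowercased before the lookup.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    allowed = normalized.get("access-control-allow-origin")
    if allowed is None:
        return False
    # "*" permits any origin; otherwise the value must match the page's origin
    return allowed == "*" or (origin is not None and allowed == origin)


# Illustrative header sets (made up for this example, not captured responses)
print(allows_cross_origin({"Access-Control-Allow-Origin": "*"}))
print(allows_cross_origin({"Content-Type": "application/octet-stream"}))
```

With the standard library, something like `dict(response.headers.items())` from a `urllib.request.urlopen` response could be passed in as `headers`.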

matthewfeickert commented 2 years ago

Looks like the https://replite.vercel.app/retro/consoles/?toolbar=1 URLs are now 404ing and we should be switching to using https://jupyterlite.readthedocs.io/en/latest/_static/repl/index.html?toolbar=1 as a base URL.

(I'll go and update the URLs above to keep the examples working.)

matthewfeickert commented 2 years ago

We can stick this in an iframe perhaps

@kratsg was thinking ahead. 👍

Today I read Jeremy Tuloup's Jupyter Everywhere blog post that he also put out today (2022-03-15) and with just

$ git diff origin/master 
diff --git a/README.rst b/README.rst
index 07fe617c..763f0cc0 100644
--- a/README.rst
+++ b/README.rst
@@ -32,6 +32,17 @@ to support modern computational graph libraries such as PyTorch and
 TensorFlow in order to make use of features such as autodifferentiation
 and GPU acceleration.

+Try out now with JupyterLite
+----------------------------
+
+.. raw:: html
+
+   <iframe
+      src="https://jupyterlite.github.io/demo/repl/index.html?kernel=python&toolbar=1&code=import%20micropip%0Aawait%20micropip.install%28%5B%22pyhf%22%2C%20%22requests%22%5D%29%0Aimport%20pyhf"
+      width="100%"
+      height="500px"
+   ></iframe>
+
 User Guide
 ----------

things are running in the docs:

in_your_docs_calculating_your_cls
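
The iframe block in that diff can also be generated rather than hand-encoded. A minimal sketch, assuming the same demo REPL base URL and query parameters as the diff; `repl_iframe_rst` and its defaults are names invented for this example.

```python
import urllib.parse

# Base URL taken from the diff above
REPL_BASE = "https://jupyterlite.github.io/demo/repl/index.html"


def repl_iframe_rst(code, width="100%", height="500px"):
    """Build a reStructuredText raw-HTML block embedding the REPL."""
    src = f"{REPL_BASE}?kernel=python&toolbar=1&code={urllib.parse.quote(code)}"
    lines = [
        ".. raw:: html",
        "",
        "   <iframe",
        f'      src="{src}"',
        f'      width="{width}"',
        f'      height="{height}"',
        "   ></iframe>",
    ]
    return "\n".join(lines)


startup = 'import micropip\nawait micropip.install(["pyhf", "requests"])\nimport pyhf'
print(repl_iframe_rst(startup))
```

This keeps the startup code readable in one place, and changing the pinned packages only means editing `startup`.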

kratsg commented 2 years ago

@kratsg was thinking ahead. 👍

not really. I just know the limitations of gh-pages and sphinx :)

matthewfeickert commented 2 years ago

The more interesting part of the article is about jupyterlite-sphinx. Though I haven't been able to get it working well yet.

matthewfeickert commented 2 years ago

I've also now come across why requests isn't working in pyodide: https://github.com/pyodide/pyodide/issues/529

So until that's fixed (probably won't be for quite some time as I think the pyodide team is already swamped and focusing on other areas currently) we'd need to acquire files from HEPData through an alternative method.

matthewfeickert commented 2 years ago

So until that's fixed (probably won't be for quite some time as I think the pyodide team is already swamped and focusing on other areas currently) we'd need to acquire files from HEPData through an alternative method.

Actually this got addressed 4 days ago in https://github.com/pyodide/pyodide/pull/2263! So that's exciting. Hopefully the next(?) release of pyodide (v0.20.0 in April https://github.com/pyodide/pyodide/issues/529#issuecomment-1068563138) will include this.

matthewfeickert commented 2 years ago

@kratsg @lukasheinrich to follow up on some discussions that we've had in chat: I currently have something similar to https://github.com/scikit-hep/pyhf/issues/1775#issuecomment-1068415788 implemented on the branch docs/embedd-jupyterlite-into-readme that has the following RTD render (cf. PR #1820)

https://pyhf.readthedocs.io/en/docs-embedd-jupyterlite-into-readme/

I tried using jupyterlite-sphinx, but it isn't ready yet given https://github.com/jupyterlite/jupyterlite-sphinx/issues/37 and so we'll have to wait for that later on down the line.

The current thing to figure out is the following:

As PR #1820 is currently using an iframe inside of a

.. raw:: html

directive in the README, the iframe won't render in GitHub's preview of the README or on PyPI, for security reasons. This means that any text around the iframe explaining it will show up on GitHub and PyPI with no iframe, and will seem strange and out of context, or make it look like our docs are broken. It would be good to avoid this. (edit: This way won't work, as twine check dist/* will fail on the use of a "raw" directive in the long_description provided by the README)
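
A quick way to catch this before building a distribution is to scan the README text for raw directives. This is only a heuristic text scan with a hypothetical helper name (`find_raw_directives`); it is not how twine itself validates the long_description.

```python
import re

# Matches a line that starts a reStructuredText "raw" directive
RAW_DIRECTIVE = re.compile(r"\s*\.\.\s+raw::")


def find_raw_directives(rst_text):
    """Return the 1-based line numbers where a raw directive starts."""
    return [
        lineno
        for lineno, line in enumerate(rst_text.splitlines(), start=1)
        if RAW_DIRECTIVE.match(line)
    ]


readme = """\
Try out now with JupyterLite
----------------------------

.. raw:: html

   <iframe src="https://example.org"></iframe>
"""
print(find_raw_directives(readme))
```

An empty list would suggest the README is free of raw directives, though running twine check remains the real test.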

The next question then is how to avoid this. Do we just include the directive with no supporting text, let it get ripped out/not rendered by GitHub and PyPI, and assume that anyone visiting the docs who sees it will understand what it is without instructions? The NumPy team gives quite a bit of helpful info in theirs. Or do we stop using the README.rst as the landing page for our docs and have a separate page, very similar to the README, that actually gets rendered for the docs?

https://github.com/scikit-hep/pyhf/blob/569f51257c6a895508f9c026bd61a2e723cb339c/docs/index.rst?plain=1#L34

This spares PyPI, but means that we now have two documents to keep synced. Either that or we need to then start doing injections of the raw HTML directive into the README.rst at docs build time but that sounds like a messy idea.

Thoughts?

kratsg commented 2 years ago

Also link #1736 as it's related.

kratsg commented 2 years ago

This spares PyPI, but means that we now have two documents to keep synced. Either that or we need to then start doing injections of the raw HTML directive into the README.rst at docs build time but that sounds like a messy idea.

Why not just include it directly in docs/index.rst and not in the README? The README is what's shown for github and pypi, and it's injected into our docs. So we can add more in index.rst without screwing up the other places.

matthewfeickert commented 2 years ago

Why not just include it directly in docs/index.rst and not in the README? The README is what's shown for github and pypi, and it's injected into our docs. So we can add more in index.rst without screwing up the other places.

I had thought of that, but then where do you insert it? If you insert it just above the README

Try out now with Pyolite
------------------------

.. raw:: html

   <iframe
      src="https://jupyterlite.github.io/demo/repl/index.html?kernel=python&toolbar=1&code=import%20micropip%0Aawait%20micropip.install%28%5B%22pyhf%3D%3D0.6.3%22%2C%20%22requests%22%5D%29%0Aimport%20pyhf"
      width="100%"
      height="500px"
   ></iframe>

https://github.com/scikit-hep/pyhf/blob/569f51257c6a895508f9c026bd61a2e723cb339c/docs/index.rst?plain=1#L33-L34

then it is the very first thing on the page, which doesn't look great to my mind.

above_readme

I was thinking that the Pyolite window would be the first element that the docs visitor sees after they have seen the project logo and introductory text. So, the user:

I could be missing something easy here though.

kratsg commented 2 years ago

I could be missing something easy here though.

Splice it: https://stackoverflow.com/a/54519037/1532974
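
For comparison, the "inject at docs build time" idea mentioned earlier could be sketched in plain Python as below. This is not necessarily what the linked answer does; `insert_before_heading` is a hypothetical helper, and it assumes section headings are a title line followed by an underline of one repeated punctuation character.

```python
def insert_before_heading(rst_text, heading, snippet):
    """Insert `snippet` (plus a blank line) just before the given heading."""
    lines = rst_text.splitlines()
    for index, line in enumerate(lines[:-1]):
        underline = lines[index + 1].strip()
        is_heading = (
            line.strip() == heading
            and len(underline) >= len(heading)
            and len(set(underline)) == 1
            and underline[0] in "=-~^#*+"
        )
        if is_heading:
            spliced = lines[:index] + snippet.splitlines() + [""] + lines[index:]
            return "\n".join(spliced) + "\n"
    raise ValueError(f"heading {heading!r} not found")


readme = "Intro text.\n\nUser Guide\n----------\n\nBody.\n"
snippet = "Try out now with JupyterLite\n----------------------------\n\n(iframe goes here)"
print(insert_before_heading(readme, "User Guide", snippet))
```

The README on disk stays untouched for GitHub and PyPI, and only the docs build would see the spliced text.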

matthewfeickert commented 2 years ago

So until that's fixed (probably won't be for quite some time as I think the pyodide team is already swamped and focusing on other areas currently) we'd need to acquire files from HEPData through an alternative method.

Actually this got addressed 4 days ago in pyodide/pyodide#2263! So that's exciting. Hopefully the next(?) release of pyodide (v0.20.0 in April pyodide/pyodide#529 (comment)) will include this.

@henryiii has pointed out that this actually won't happen:

No it won’t. Requests doesn’t work unless someone rewrites the stdlib module in terms of browser APIs.

and also pointed us to https://github.com/pyodide/pyodide/issues/140.

So looks like I got overly excited and didn't read https://github.com/pyodide/pyodide/issues/529#issuecomment-962175588 carefully enough.

henryiii commented 2 years ago

And you can make it work (not using portable code though):

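import json

import pyodide
import pyhf  # installed earlier with micropip

# open_url fetches the file synchronously and returns a file-like text object
inp = pyodide.open_url("https://raw.githubusercontent.com/scikit-hep/pyhf/master/docs/examples/json/2-bin_1-channel.json")
wspace = pyhf.Workspace(json.load(inp))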

(redirects are not supported)
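
If the same notebook should also run outside the browser, a small wrapper can pick the fetch method at runtime. This is a sketch under assumptions: `open_text` is a made-up name, the `sys.platform == "emscripten"` check is one common way to detect pyodide, and the import fallback is there because I believe newer pyodide releases expose `open_url` under `pyodide.http`.

```python
import io
import json
import sys


def open_text(url):
    """Return a file-like text object for `url`, in or out of the browser."""
    if sys.platform == "emscripten":
        # Running under pyodide: use its browser-backed synchronous fetch
        try:
            from pyodide.http import open_url
        except ImportError:
            from pyodide import open_url
        return open_url(url)
    # Regular CPython: fall back to the standard library
    import urllib.request

    with urllib.request.urlopen(url) as response:
        return io.StringIO(response.read().decode("utf-8"))


def load_json(url):
    """Fetch `url` and parse its contents as JSON."""
    return json.load(open_text(url))
```

The result of `load_json` on the workspace URL above could then be passed to `pyhf.Workspace` in either environment.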