cbyrohl / scida

scida is an out-of-the-box analysis tool for large scientific datasets. It primarily supports the astrophysics community, focusing on cosmological and galaxy formation simulations using particles or unstructured meshes, as well as large observational datasets. This tool uses dask, allowing analysis to scale.
https://scida.io
MIT License

Visualisation example errors #112

Closed kyleaoman closed 7 months ago

kyleaoman commented 9 months ago

Attempting to run the 2D histogram example in the visualization documentation results in an error. Initially I thought the issue was that bins=100 is not a 2-tuple, but passing bins=(100, 100) or omitting this kwarg raises the same error:

Python 3.10.0 (default, Dec 21 2021, 13:36:04) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dask.array as da
>>> from scida import load
Warning! Using default configuration. Please adjust/replace in '/home/koman/.config/scida/config.yaml'.
>>> import matplotlib.pyplot as plt
>>> ds = load("snapdir_030")
>>> dens = ds.data["PartType0"]["Density"]
>>> temp = ds.data["PartType0"]["Temperature"]
>>> hist, xedges, yedges = da.histogram2d(dens, temp, bins=100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/koman/.local/lib/python3.10/site-packages/dask/array/routines.py", line 1119, in histogram2d
    counts, edges = histogramdd(
  File "/home/koman/.local/lib/python3.10/site-packages/dask/array/routines.py", line 1431, in histogramdd
    if all(isinstance(b, int) for b in bins) and all(len(r) == 2 for r in range):
TypeError: 'NoneType' object is not iterable
>>> hist, xedges, yedges = da.histogram2d(dens, temp, bins=(100, 100))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/koman/.local/lib/python3.10/site-packages/dask/array/routines.py", line 1119, in histogram2d
    counts, edges = histogramdd(
  File "/home/koman/.local/lib/python3.10/site-packages/dask/array/routines.py", line 1431, in histogramdd
    if all(isinstance(b, int) for b in bins) and all(len(r) == 2 for r in range):
TypeError: 'NoneType' object is not iterable
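
Both calls stop at the same line because `range` is left at its default of `None`, and the check shown in the traceback iterates over `range` whenever every entry of `bins` is an integer. The sketch below is a simplified stand-in for that single check, not dask's actual implementation: the function name `edges_from_int_bins` is hypothetical, and expanding a single integer to one count per axis is an assumption (it is consistent with both calls above failing identically). It reproduces the `TypeError` and shows that an explicit `(min, max)` pair per axis gets past the check.

```python
_range = range  # keep the builtin reachable; the parameter below shadows it


def edges_from_int_bins(bins, range=None):
    """Hypothetical stand-in that mirrors the check from the traceback."""
    # Assumption: a single integer means "this many bins on both axes".
    if isinstance(bins, int):
        bins = (bins, bins)
    # Same expression as in the traceback: iterating over range=None fails.
    if all(isinstance(b, int) for b in bins) and all(len(r) == 2 for r in range):
        edges = []
        for nbins, (lo, hi) in zip(bins, range):
            step = (hi - lo) / nbins
            edges.append([lo + i * step for i in _range(nbins + 1)])
        return edges
    raise ValueError("this sketch only handles integer bin counts")


# Without a range the check itself raises, as in the session above.
try:
    edges_from_int_bins((100, 100))
except TypeError as err:
    print("TypeError:", err)

# With one (min, max) pair per axis the bin edges can be built.
print(edges_from_int_bins(2, range=((0.0, 1.0), (0.0, 4.0))))
```
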
kyleaoman commented 9 months ago

I also get an error for the interactive visualization example:

Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import holoviews as hv
>>> import holoviews.operation.datashader as hd
>>> import datashader as dshdr
>>> from scida import load
Warning! Using default configuration. Please adjust/replace in '/home/koman/.config/scida/config.yaml'.
>>> ds = load('snapdir_030')
>>> ddf = ds.data["PartType0"].get_dataframe(["Coordinates0", "Coordinates1", "Masses"])  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/koman/code/scida-joss/src/scida/fields.py", line 295, in get_dataframe
    dfs = [dd.from_dask_array(v, columns=[k]) for k, v in dss.items()]
  File "/home/koman/code/scida-joss/src/scida/fields.py", line 295, in <listcomp>
    dfs = [dd.from_dask_array(v, columns=[k]) for k, v in dss.items()]
  File "/home/koman/.local/lib/python3.10/site-packages/dask/dataframe/io/io.py", line 433, in from_dask_array
    meta = _meta_from_array(x, columns, index, meta=meta)
  File "/home/koman/.local/lib/python3.10/site-packages/dask/dataframe/io/io.py", line 64, in _meta_from_array
    meta = meta_lib_from_array(x).DataFrame()
  File "/home/koman/.local/lib/python3.10/site-packages/dask/utils.py", line 641, in __call__
    meth = self.dispatch(type(arg))
  File "/home/koman/.local/lib/python3.10/site-packages/dask/utils.py", line 635, in dispatch
    raise TypeError(f"No dispatch for {cls}")
TypeError: No dispatch for <class 'pint.Quantity'>
kyleaoman commented 9 months ago

In case it's a compatibility issue, here's my package listing:

anyio==4.1.0
apt-clone==0.2.1
apturl==0.5.2
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asciitree==0.3.3
astropy==5.3.4
asttokens==2.4.1
async-lru==2.0.4
attrs==23.1.0
Babel==2.13.1
beautifulsoup4==4.10.0
bleach==6.1.0
blinker==1.4
bokeh==3.3.2
Brlapi==0.8.3
certifi==2020.6.20
cffi==1.16.0
chardet==4.0.0
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==3.0.0
colorama==0.4.4
colorcet==3.0.1
comm==0.2.0
command-not-found==0.3
configobj==5.0.6
contourpy==1.2.0
crudini==0.9.3
cryptography==3.4.8
cupshelpers==1.0
cycler==0.12.1
dask==2023.12.0
datashader==0.16.0
dbus-python==1.2.18
debugpy==1.8.0
decorator==5.1.1
defer==1.0.6
defusedxml==0.7.1
distlib==0.3.4
distributed==2023.12.0
distro==1.7.0
exceptiongroup==1.2.0
executing==2.0.1
eyeD3==0.8.10
fasteners==0.19
fastjsonschema==2.19.0
filelock==3.6.0
fonttools==4.46.0
fqdn==1.5.1
fsspec==2023.12.1
grpcio==1.30.2
h5py==3.10.0
holoviews==1.18.1
html5lib==1.1
httplib2==0.20.2
idna==3.3
ifaddr==0.1.7
IMDbPY==2021.4.18
importlib-metadata==7.0.0
iniconfig==2.0.0
iniparse==0.4
ipykernel==6.27.1
ipython==8.18.1
ipywidgets==8.1.1
isoduration==20.11.0
jedi==0.19.1
jeepney==0.7.1
Jinja2==3.1.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.20.0
jsonschema-specifications==2023.11.2
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-lsp==2.2.1
jupyter_client==8.6.0
jupyter_core==5.5.0
jupyter_server==2.12.1
jupyter_server_terminals==0.4.4
jupyterlab==4.0.9
jupyterlab-widgets==3.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
keyring==23.5.0
kiwisolver==1.4.5
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
libevdev==0.5
linkify-it-py==2.0.2
llvmlite==0.40.1
locket==1.0.0
louis==3.20.0
lxml==4.8.0
macaroonbakery==1.3.1
Mako==1.1.3
Markdown==3.3.6
markdown-it-py==3.0.0
MarkupSafe==2.0.1
matplotlib==3.8.2
matplotlib-inline==0.1.6
mdit-py-plugins==0.4.0
mdurl==0.1.2
meson==0.61.2
mistune==3.0.2
more-itertools==8.10.0
msgpack==1.0.7
multipledispatch==1.0.0
nbclient==0.9.0
nbconvert==7.12.0
nbformat==5.9.2
nemo-emblems==5.8.0
nest-asyncio==1.5.8
netaddr==0.8.0
netifaces==0.11.0
notebook==7.0.6
notebook_shim==0.2.3
numba==0.57.1
numcodecs==0.12.1
numpy==1.24.4
oauthlib==3.2.0
onboard==1.4.1
overrides==7.4.0
packaging==21.3
PAM==0.4.2
pandas==2.1.4
pandocfilters==1.5.0
panel==1.3.4
param==2.0.1
parso==0.8.3
partd==1.4.1
pexpect==4.8.0
Pillow==9.0.1
Pint==0.22
platformdirs==2.5.1
pluggy==1.3.0
prometheus-client==0.19.0
prompt-toolkit==3.0.41
protobuf==3.12.4
psutil==5.9.0
ptyprocess==0.7.0
pure-eval==0.2.2
pycairo==1.20.1
pycparser==2.21
pycryptodomex==3.11.0
pyct==0.5.0
pycups==2.0.1
pycurl==7.44.1
pyelftools==0.27
pyerfa==2.0.1.1
Pygments==2.17.2
PyGObject==3.42.1
PyICU==2.8.1
pyinotify==0.9.6
PyJWT==2.3.0
pymacaroons==0.13.0
PyNaCl==1.5.0
pyparsing==2.4.7
pyparted==3.11.7
pyRFC3339==1.1
pytest==7.4.3
python-apt==2.4.0+ubuntu2
python-dateutil==2.8.2
python-debian==0.1.43+ubuntu1.1
python-gnupg==0.4.8
python-json-logger==2.0.7
python-magic==0.4.24
python-xlib==0.29
pytz==2022.1
pyudev==0.22.0
pyviz_comms==3.0.0
pyxattr==0.7.2
pyxdg==0.27
PyYAML==5.4.1
pyzmq==25.1.2
qtconsole==5.5.1
QtPy==2.4.1
referencing==0.32.0
reportlab==3.6.8
requests==2.31.0
requests-file==1.5.1
requests-unixsocket==0.2.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.13.2
# Editable install with no version control (scida==0.2.4)
-e /home/koman/code/scida-joss
scipy==1.11.4
scour==0.38.2
SecretStorage==3.3.1
Send2Trash==1.8.2
setproctitle==1.2.2
six==1.16.0
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve==2.3.1
stack-data==0.6.3
systemd-python==234
tblib==3.0.0
terminado==0.18.0
tinycss2==1.1.1
tldextract==3.1.2
tomli==2.0.1
toolz==0.12.0
tornado==6.4
tqdm==4.66.1
traitlets==5.14.0
types-python-dateutil==2.8.19.14
typing_extensions==4.8.0
tzdata==2023.3
ubuntu-advantage-tools==8001
ubuntu-drivers-common==0.0.0
uc-micro-py==1.0.2
ufw==0.36.1
Unidecode==1.3.3
uri-template==1.3.0
urllib3==1.26.5
virtualenv==20.13.0+ds
wadllib==1.3.6
wcwidth==0.2.12
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
widgetsnbextension==4.0.9
xarray==2023.12.0
xdg==5
xkit==0.0.0
xlrd==1.2.0
xyzservices==2023.10.1
youtube-dl==2021.12.17
zarr==2.16.1
zict==3.0.0
zipp==1.0.0
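
For a compatibility check, a shorter listing restricted to the packages that show up in the two tracebacks may be easier to compare than a full freeze. A minimal sketch using only the standard library; the `RELEVANT` list and the function name `installed_versions` are illustrative choices, not part of scida:

```python
from importlib import metadata

# Packages that appear in the tracebacks or examples in this thread.
RELEVANT = [
    "scida", "dask", "distributed", "pint", "numpy",
    "pandas", "holoviews", "datashader", "bokeh",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(RELEVANT).items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```
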
cbyrohl commented 9 months ago

Sorry about that. I will automate testing of the doc examples in the future. These problems will be fixed with PR #113. In the first case, the example has been adjusted; in the second case, we now strip the units before passing the arrays to dask dataframes, because dask dataframes do not handle pint units (for now).
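
The idea behind the second fix can be illustrated without scida or dask. The sketch below is not the code from PR #113: `FakeQuantity` is a hypothetical stand-in for `pint.Quantity`, and `REGISTRY`, `dispatch` and `strip_units` are hypothetical names standing in for a type-based lookup. Only the `.magnitude` attribute name is taken from pint. It shows the lookup failing for a unit-wrapped object and succeeding once the units are stripped.

```python
class FakeQuantity:
    """Hypothetical stand-in for pint.Quantity: a value plus a unit string."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units


# Hypothetical registry mapping a container type to a backend name.
REGISTRY = {list: "list-backend"}


def dispatch(obj):
    """Look up a backend by walking the type's MRO; fail if none is registered."""
    for cls in type(obj).__mro__:
        if cls in REGISTRY:
            return REGISTRY[cls]
    raise TypeError(f"No dispatch for {type(obj)}")


def strip_units(obj):
    """Return the bare magnitude if obj carries units, else obj unchanged."""
    return getattr(obj, "magnitude", obj)


masses = FakeQuantity([1.0, 2.0, 3.0], "Msun")

# The unit-wrapped object has no registered backend.
try:
    dispatch(masses)
except TypeError as err:
    print("TypeError:", err)

# After stripping the units, the plain container is recognised.
print(dispatch(strip_units(masses)))
```

The trade-off of this approach is that the unit information is no longer attached to the data once it reaches the dataframe, so it has to be tracked separately if it is needed later.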

kyleaoman commented 9 months ago

The 2D histogram example now works for me.

The interactive visualisation example seems to be missing some code to actually make the plot appear (I'm running from the CLI, so I'm not sure whether it would appear automatically in a notebook). I'm not familiar with holoviews, but I took a guess with hv.render(hd.dynspread(...)). This seemed likely to work, but I get errors that look like version incompatibilities between numpy and holoviews. I tried monkey patching that (import warnings; np.warnings = warnings) and then got an incorrect call to something in the core Python inspect module.

At a guess this might be because I'm trying to run in python3.11, which I see isn't in your test matrix. Might be worth specifying <=3.10 in the pyproject file? Or perhaps I'm just not using holoviews correctly...

cbyrohl commented 9 months ago

Interesting. Does it show the bokeh/holoviews icons? [image]

As far as I understand, no dedicated plotting/render command should need to be issued (see e.g. https://holoviews.org/getting_started/Introduction.html). I am using Python 3.9; I also tested a fresh Python 3.11.5 environment:

conda create -n scida_py311 python=3.11 -y
conda activate scida_py311
pip install git+https://github.com/cbyrohl/scida@bb351d9224d4cd095ccbe3873c2e98c63612e56b datashader holoviews
python -m ipykernel install --user --name scida_py311 --display-name "scida_py311"

The snippet for the interactive plotting works just fine* in this environment (also implying that there is no inherent problem with python 3.11). From my experience, bokeh applets not showing is related to the frontend environment that runs the jupyter server (which can be different from the ipython kernel). Could you check https://stackoverflow.com/a/46894021 ?

Once this is resolved, I will add an infobox linking to resources on such issues with bokeh/datashader.

*=even though I get the following additional output:
%opts magic unavailable (pyparsing cannot be imported)
%compositor magic unavailable (pyparsing cannot be imported)

kyleaoman commented 9 months ago

Ok, I have some more information. Up until now I was trying to work either from a basic python interactive session, or an ipython interactive session (without jupyter-notebook or jupyter-lab).

I confirm that with your proposed conda venv I can run the example without generating errors (but without generating expected output either). In python and ipython I don't get any visualisation appearing (perhaps unsurprisingly - https://github.com/holoviz/holoviews/issues/4434). I can however assign and then save the visualisation like this:

composition = hd.dynspread(shaded, threshold=0.9, max_px=50).opts(bgcolor="black", xaxis=None, yaxis=None, width=500, height=500)
hv.save(composition, "test.html")

This does create a bokeh figure that shows the density field, however the dynamic loading as the image is panned or zoomed doesn't work in this case. Unclear to me whether this is the expected behaviour when using hv.save. I get a similar result if I do:

import bokeh.plotting
fig = hv.render(composition)
bokeh.plotting.show(fig)

That opens in a browser but doesn't do the dynamic loading demonstrated in the video in the tutorial.

If I run your snippet in a jupyter-lab or jupyter-notebook I get the bokeh and hv logos but no figure. In principle we should both be using the same versions of jupyter (installed with the venv), right? So I think that the frontends should be the same? I tried a bit of troubleshooting like conda install jupyter_bokeh but this promptly led to what looked like lots of conflicts between packages and APIs. Not quite sure where to go from here. I'm not a heavy notebook/lab user so debugging this is a bit beyond what I want to attempt. Happy to try testing further sandboxed solutions on my systems if that's helpful...

Overall for the visualisation examples page I suggest signposting to holoviews & bokeh docs as needed, and making it clear whether you expect the user to run the example in python/ipython/jupyter-notebook/jupyter-lab.

I can't seem to reproduce the numpy+holoviews conflicts that I mentioned before on fresh virtual environment installations, even with various permutations. I must have inadvertently created some conflict through subsequent package installs in the venv I was working on before (that I'd created specifically for this review, so it was supposed to be relatively "clean"). I think that issue can be dropped.

cbyrohl commented 9 months ago

> This does create a bokeh figure that shows the density field, however the dynamic loading as the image is panned or zoomed doesn't work in this case. Unclear to me whether this is the expected behaviour when using hv.save.

Good point. I would generally not assume that hv.save() allows live data interaction.

> If I run your snippet in a jupyter-lab or jupyter-notebook I get the bokeh and hv logos but no figure. In principle we should both be using the same versions of jupyter (installed with the venv), right?

The snippet above only uses that environment for the ipython kernel, not for the front end. The kernel and the jupyter server can be entirely different versions (I start the jupyter server from another environment). For a better comparison, I have now started the jupyter server from the above conda environment as well: the interactive plotting works as expected there too. Potentially this has something to do with the jupyter configuration and plugins [which are not isolated in the env]?
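
To see which environment a given kernel actually uses, a cell like the following can be run in the notebook. It reports the kernel's interpreter only, so the server side would have to be checked separately, for instance by running the same snippet with the Python that launches jupyter. This is a stdlib-only sketch, and `describe_interpreter` is a hypothetical helper name.

```python
import sys


def describe_interpreter():
    """Collect where the current Python process is running from."""
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root of the active environment
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If the two runs report different prefixes, the kernel and the server live in different environments, and front-end extensions installed in one are not necessarily visible to the other.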

> Overall for the visualisation examples page I suggest signposting to holoviews & bokeh docs as needed, and making it clear whether you expect the user to run the example in python/ipython/jupyter-notebook/jupyter-lab.

I agree: I think the interactive mode should be expected to only work with jupyter-notebook and jupyter-lab - once we fix your issue.

Could you try running this notebook? This would determine whether this has anything to do with scida at all.

cbyrohl commented 7 months ago

I will close this issue shortly. I fixed the original two errors in PR #113 and have added a link to the holoviews tutorial page to try out first before attempting to execute the scida example.