Open flamingbear opened 1 month ago
Both the HTML repr and the normal repr struggle with large datatrees. The normal repr should probably be truncated in a similar fashion to the dataset repr: https://github.com/pydata/xarray/blob/ce5130f39d780cdce87366ee657665f4a5d3051d/xarray/core/options.py#L67
With this example the HTML repr takes about 3 minutes, compared to 840 ms for the normal repr:
import numpy as np
import xarray as xr
from xarray.core.datatree import DataTree


def create_datatree(number_of_files, number_of_groups, number_of_variables):
    datasets = {}
    for f in range(number_of_files):
        for g in range(number_of_groups):
            # Create random data:
            time = np.linspace(0, 50 + f, 100 + g)
            y = f * time + g

            # Create dataset:
            ds = xr.Dataset(
                data_vars={
                    f"temperature_{g}{i}": ("time", y)
                    for i in range(number_of_variables // number_of_groups)
                },
                coords={"time": ("time", time)},
            ).chunk()

            # Prepare for Datatree:
            name = f"file_{f}/group_{g}"
            datasets[name] = ds

    dt = DataTree.from_dict(datasets)

    return dt


number_of_files = 25
number_of_groups = 20
number_of_variables = 2000
dt = create_datatree(number_of_files, number_of_groups, number_of_variables)

# %timeit dt._repr_html_()
# 3min 15s ± 4.37 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

# %timeit dt.__repr__()
# 840 ms ± 29.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
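For illustration only, the truncation suggested above for the normal repr could look roughly like the sketch below. It works on plain nested dicts rather than a real `DataTree`, and `truncated_tree_repr` and `max_children` are hypothetical names, not existing xarray functions or options.

```python
def truncated_tree_repr(tree, max_children=3, indent=0):
    """Return repr lines for a nested dict, showing at most
    `max_children` entries per level (hypothetical helper)."""
    pad = "    " * indent
    lines = []
    items = list(tree.items())
    for name, child in items[:max_children]:
        lines.append(pad + name)
        # Recurse only into the children that are actually displayed.
        if isinstance(child, dict):
            lines.extend(truncated_tree_repr(child, max_children, indent + 1))
    hidden = len(items) - max_children
    if hidden > 0:
        # Summarise whatever was cut off instead of printing it.
        lines.append(pad + f"... ({hidden} more)")
    return lines


# Same shape as the benchmark above: 25 files with 20 groups each.
tree = {
    f"file_{f}": {f"group_{g}": None for g in range(20)}
    for f in range(25)
}
print("\n".join(truncated_tree_repr(tree, max_children=2)))
```

The output size then depends on `max_children` and the tree depth rather than on the total number of nodes.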
What is your issue?
Originally posted by @TomNicholas in https://github.com/xarray-contrib/datatree/issues/206
@andersy005, @jbusecke and I noticed that for big trees (hundreds or thousands of nodes) the HTML repr can become very slow to render, potentially locking up your Jupyter notebook.
We think that's because the HTML representing the whole tree is pre-rendered in one go, and hidden by defaulting sections to be closed. If your tree contains thousands of nodes that's a lot of HTML to render.
@andersy005 suggested that perhaps the HTML repr should contain some kind of callback, so that the HTML for a node is only rendered when that node is opened.
I don't know if that's possible at all, or whether it would work for reprs rendered in non-interactive environments (such as in xarray's static docs pages).
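To make the difference concrete, here is a rough sketch (not xarray's actual repr code) comparing eager pre-rendering with rendering a single level and leaving placeholders for the children. The tree is modelled as nested dicts; `render_eager`, `render_lazy` and the `data-path` attribute are all hypothetical. Filling a placeholder when it is opened would still need some callback mechanism, which is not shown here, so this does not answer the static-docs question.

```python
from html import escape


def render_eager(name, children):
    """Pre-render the whole subtree as nested, closed <details> elements."""
    inner = "".join(render_eager(n, c) for n, c in children.items())
    return f"<details><summary>{escape(name)}</summary>{inner}</details>"


def render_lazy(name, children):
    """Render one node only; each child is an empty placeholder that a
    (hypothetical) callback would fill in when the child is opened."""
    placeholders = "".join(
        f'<details data-path="{escape(n)}"><summary>{escape(n)}</summary></details>'
        for n in children
    )
    return f"<details><summary>{escape(name)}</summary>{placeholders}</details>"


# 25 files with 20 groups each, as in the benchmark above.
tree = {
    f"file_{f}": {f"group_{g}": {} for g in range(20)}
    for f in range(25)
}

eager_html = render_eager("root", tree)
lazy_html = render_lazy("root", tree)

# The eager output has one <details> element per node in the tree;
# the lazy output only covers the root and its direct children.
print(eager_html.count("<details"), lazy_html.count("<details"))
```

With this layout the initial HTML grows with the number of direct children of the displayed node instead of with the size of the whole tree.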