NASA-IMPACT / veda-jupyterhub

VEDA JupyterHub technical planning and documentation

Markdown lists incorrectly formatted with checkboxes #14

Closed batpad closed 7 months ago

batpad commented 7 months ago

Steps to reproduce:

Login to https://staging.hub.openveda.cloud/ and start a new notebook.

Create a Markdown cell with the following content:

## Some heading
 - a list
 - but
 - not
 - a checklist

Expected behaviour: Get a normally formatted list (with bullets to denote list items)

Actual behaviour: List items all have check-boxes prepended to them.

A normal list should be formatted as a normal list and not have checkboxes.

This seems to be a recent regression, probably tied to recent updates. We are now running JupyterLab v4.1.4; however, I could not reproduce the problem with the same version of JupyterLab locally, so the JupyterLab version by itself does not appear to be the cause.

Thanks to @jsignell for reporting.

I guess next steps here is probably:

This is definitely strange, since it is not a case of conflicting CSS: the Markdown renderer is creating <input type="checkbox"> elements in the HTML, so the renderer itself is somehow producing the wrong output. I'll need to dig a bit more to figure out what is causing this.
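To tell "the renderer emits checkbox inputs" apart from "CSS makes bullets look like checkboxes", one option is to copy the rendered cell's HTML from the browser inspector and count the checkbox inputs in it. The following is only a minimal sketch using the Python standard library; the two HTML strings are illustrative stand-ins written for this example, not output captured from the hub.

```python
from html.parser import HTMLParser


class CheckboxCounter(HTMLParser):
    """Count <input type="checkbox"> elements in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "checkbox":
            self.count += 1


def count_checkboxes(html: str) -> int:
    parser = CheckboxCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Illustrative fragments (assumptions, not real hub output)
plain_list = "<h2>Some heading</h2><ul><li>a list</li><li>but</li></ul>"
checkbox_list = (
    '<ul><li><input type="checkbox" disabled>a list</li>'
    '<li><input type="checkbox" disabled>but</li></ul>'
)

print(count_checkboxes(plain_list))
print(count_checkboxes(checkbox_list))
```

A non-zero count for a cell that contains no task-list syntax points at the Markdown-to-HTML step rather than at styling.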

cc @wildintellect

jsignell commented 7 months ago

Thanks for writing it up! Another part of it is that the ## title shows up twice.

Things looked normal with this image: public.ecr.aws/nasa-veda/nasa-veda-singleuser:2024-03-07 and started looking bad with this one: public.ecr.aws/nasa-veda/nasa-veda-singleuser:2024-03-20

That change came in on this PR: https://github.com/2i2c-org/infrastructure/pull/3823

batpad commented 7 months ago

The problem seems to exist in the upstream pangeo-notebook image as well: pangeo/pangeo-notebook:2024.03.13

What I did to verify this:

I chose the Bring Your Own Image option after logging into JupyterHub and entered pangeo/pangeo-notebook:2024.03.13. Once in the notebook, I confirmed it was using the expected image by running:

import os
os.environ['JUPYTER_IMAGE_SPEC']

I guess I probably need to file this issue upstream? @wildintellect

batpad commented 7 months ago

Going to drop notes here as I try and narrow this down -

Going through the pangeo-notebook image tags and trying each one on the staging hub, I see this worked fine in 2024.02.02 and breaks in 2024.02.21.
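Trying tags one by one works; with a longer tag list, a bisection would need fewer hub restarts. Below is a rough sketch of that idea, assuming the tags are ordered oldest to newest and that every tag after the first broken one is also broken. The `is_broken` callback is hypothetical: in practice it stands for the manual check on the staging hub, and here it is simulated. The tag list contains only tags named in this thread, not the full set of published tags.

```python
from typing import Callable, Optional, Sequence


def first_broken(tags: Sequence[str], is_broken: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest broken tag, or None if no tag is broken.

    Assumes tags are sorted oldest to newest and that once a tag is
    broken, all later tags are broken too.
    """
    lo, hi = 0, len(tags)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken(tags[mid]):
            hi = mid  # the first broken tag is at mid or earlier
        else:
            lo = mid + 1  # the first broken tag is after mid
    return tags[lo] if lo < len(tags) else None


# Tags mentioned in this thread (not the complete list of releases)
tags = ["2024.02.02", "2024.02.21", "2024.03.13", "2024.03.30"]


def simulated_check(tag: str) -> bool:
    # Stand-in for the manual test; encodes the result reported above
    return tag >= "2024.02.21"


print(first_broken(tags, simulated_check))
```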

Now doing a diff of the packages.txt for each of those versions. The diff looks like:

7c7
< adlfs==2023.12.0
---
> adlfs==2024.2.0
9,10c9,10
< aiobotocore==2.11.1
< aiohttp==3.9.1
---
> aiobotocore==2.11.2
> aiohttp==3.9.3
24c24
< astropy-iers-data==0.2024.1.22.0.30.30
---
> astropy-iers-data==0.2024.2.19.0.28.47
44c44
< awscli==2.15.15
---
> awscli==2.15.22
46c46
< azure-core==1.29.7
---
> azure-core==1.30.0
58c58
< black==24.1.0
---
> black==24.2.0
64,65c64,65
< boto3==1.33.13
< botocore==1.33.13
---
> boto3==1.34.34
> botocore==1.34.34
68c68
< branca==0.7.0
---
> branca==0.7.1
75c75
< c-blosc2==2.13.1
---
> c-blosc2==2.13.2
77c77
< ca-certificates==2023.11.17
---
> ca-certificates==2024.2.2
85c85
< certifi==2023.11.17
---
> certifi==2024.2.2
87c87
< cf_xarray==0.8.8
---
> cf_xarray==0.9.0
111,113c111,113
< cytoolz==0.12.2
< dask==2024.1.1
< dask-core==2024.1.1
---
> cytoolz==0.12.3
> dask==2024.2.0
> dask-core==2024.2.0
120c120
< debugpy==1.8.0
---
> debugpy==1.8.1
124c124
< distributed==2024.1.1
---
> distributed==2024.2.0
131c131
< eccodes==2.33.0
---
> eccodes==2.34.0
135,136c135,136
< esmf==8.4.2
< esmpy==8.4.2
---
> esmf==8.6.0
> esmpy==8.6.0
140c140
< fastapi==0.109.0
---
> fastapi==0.109.2
147c147
< flox==0.9.0
---
> flox==0.9.2
156c156
< fonttools==4.47.2
---
> fonttools==4.49.0
170c170
< gdal==3.8.3
---
> gdal==3.8.4
174,175c174,175
< geopandas==0.14.2
< geopandas-base==0.14.2
---
> geopandas==0.14.3
> geopandas-base==0.14.3
179c179
< geoviews-core==1.11.0
---
> geoviews-core==1.11.1
182c182
< gh==2.42.1
---
> gh==2.43.1
187c187
< gitpython==3.1.41
---
> gitpython==3.1.42
191,192c191,192
< google-api-core==2.15.0
< google-auth==2.27.0
---
> google-api-core==2.17.1
> google-auth==2.28.1
215c215
< holoviews==1.18.1
---
> holoviews==1.18.3
217,219c217,219
< httpcore==1.0.2
< httpx==0.26.0
< hvplot==0.9.1
---
> httpcore==1.0.4
> httpx==0.27.0
> hvplot==0.9.2
224c224
< imageio==2.33.1
---
> imageio==2.34.0
229c229
< intake==0.7.0
---
> intake==2.0.1
235c235
< ipykernel==6.29.0
---
> ipykernel==6.29.2
241c241
< ipywidgets==8.1.1
---
> ipywidgets==8.1.2
244c244
< jasper==4.1.2
---
> jasper==4.2.1
250c250
< json5==0.9.14
---
> json5==0.9.17
270c270
< jupyterlab==4.0.11
---
> jupyterlab==4.1.2
272c272
< jupyterlab-myst==2.1.0
---
> jupyterlab-myst==2.3.1
275,276c275,276
< jupyterlab_server==2.25.2
< jupyterlab_widgets==3.0.9
---
> jupyterlab_server==2.25.3
> jupyterlab_widgets==3.0.10
279c279
< kerchunk==0.2.2
---
> kerchunk==0.2.3
310c310
< libdrm==2.4.114
---
> libdrm==2.4.120
317a318
> libgcrypt==1.10.3
319c320
< libgdal==3.8.3
---
> libgdal==3.8.4
322c323,324
< libglib==2.78.3
---
> libgirepository==1.78.1
> libglib==2.78.4
325a328
> libgpg-error==1.47
355c358
< libpciaccess==0.17
---
> libpciaccess==0.18
357,358c360,361
< libpng==1.6.39
< libpq==16.1
---
> libpng==1.6.42
> libpq==16.2
363a367
> libsecret==0.18.8
367c371
< libsqlite==3.44.2
---
> libsqlite==3.45.1
383c387
< libxml2==2.12.4
---
> libxml2==2.12.5
389c393
< linkify-it-py==2.0.2
---
> linkify-it-py==2.0.3
391d394
< lmoments3==1.0.6
397c400
< mako==1.3.1
---
> mako==1.3.2
401,402c404,405
< markupsafe==2.1.4
< matplotlib-base==3.8.2
---
> markupsafe==2.1.5
> matplotlib-base==3.8.3
411c414
< morecantile==5.2.2
---
> morecantile==5.3.0
413c416
< mpich==4.1.2
---
> mpich==4.2.0
415c418
< msal_extensions==1.0.0
---
> msal_extensions==1.1.0
417,418c420,421
< multidict==6.0.4
< multimethod==1.9.1
---
> multidict==6.0.5
> multimethod==1.11
424c427
< nbconvert-core==7.14.2
---
> nbconvert-core==7.16.1
428c431
< nbstripout==0.6.1
---
> nbstripout==0.7.1
430c433
< nccl==2.19.4.1
---
> nccl==2.20.3.1
438,439c441,442
< notebook==7.0.7
< notebook-shim==0.2.3
---
> notebook==7.1.0
> notebook-shim==0.2.4
441c444
< nss==3.97
---
> nss==3.98
443c446
< numbagg==0.7.1
---
> numbagg==0.8.0
445c448
< numpy==1.26.3
---
> numpy==1.26.4
448c451
< ocl-icd==2.3.1
---
> ocl-icd==2.3.2
451c454
< odc-stac==0.3.8
---
> odc-stac==0.3.9
454c457
< openssl==3.2.0
---
> openssl==3.2.1
463,464c466,467
< pangeo-dask==2024.01.28
< pangeo-notebook==2024.01.28
---
> pangeo-dask==2024.02.21
> pangeo-notebook==2024.02.21
468c471
< parcels==3.0.1
---
> parcels==3.0.2
479,480c482,483
< pip==23.3.2
< pixman==0.43.0
---
> pip==24.0
> pixman==0.43.2
482c485
< platformdirs==4.1.0
---
> platformdirs==4.2.0
484c487
< pooch==1.8.0
---
> pooch==1.8.1
486c489
< poppler==23.12.0
---
> poppler==24.02.0
489c492
< postgresql==16.1
---
> postgresql==16.2
492c495
< prometheus_client==0.19.0
---
> prometheus_client==0.20.0
506a510
> pycairo==1.26.0
511,512c515,516
< pydantic==2.5.3
< pydantic-core==2.14.6
---
> pydantic==2.6.1
> pydantic-core==2.16.2
515a520
> pygobject==3.46.0
517c522
< pykdtree==1.3.10
---
> pykdtree==1.3.11
520c525
< pyorbital==1.8.1
---
> pyorbital==1.8.2
523c528
< pyresample==1.27.1
---
> pyresample==1.28.1
529,530c534,535
< pytest==8.0.0
< python==3.11.7
---
> pytest==8.0.1
> python==3.11.8
536c541
< python-geotiepoints==1.7.1
---
> python-geotiepoints==1.7.2
541c546
< python-tzdata==2023.4
---
> python-tzdata==2024.1
545c550
< pytz==2023.3.post1
---
> pytz==2024.1
553c558
< rdma-core==49.0
---
> rdma-core==50.0
557c562
< referencing==0.32.1
---
> referencing==0.33.0
562c567
< rio-cogeo==5.1.1
---
> rio-cogeo==5.2.0
565c570
< rpds-py==0.17.1
---
> rpds-py==0.18.0
572,573c577,578
< s3transfer==0.8.2
< satpy==0.46.0
---
> s3transfer==0.10.0
> satpy==0.47.0
575c580
< scikit-learn==1.4.0
---
> scikit-learn==1.4.1.post1
580,581c585,586
< setuptools==69.0.3
< shapely==2.0.2
---
> setuptools==69.1.0
> shapely==2.0.3
592,593c597,598
< sqlalchemy==2.0.25
< sqlite==3.44.2
---
> sqlalchemy==2.0.27
> sqlite==3.45.1
596c601
< starlette==0.35.0
---
> starlette==0.36.3
603,607c608,612
< threadpoolctl==3.2.0
< tifffile==2023.12.9
< tiledb==2.19.0
< tiledb-py==0.25.0
< timezonefinder==6.2.0
---
> threadpoolctl==3.3.0
> tifffile==2024.2.12
> tiledb==2.20.0
> tiledb-py==0.26.0
> timezonefinder==6.4.1
614,615c619,620
< tornado==6.3.3
< tqdm==4.66.1
---
> tornado==6.4
> tqdm==4.66.2
618,619c623,624
< trajan==0.5.1
< trollimage==1.22.2
---
> trajan==0.6.0
> trollimage==1.23.1
625,627c630,632
< tzcode==2023d
< tzdata==2023d
< uc-micro-py==1.0.2
---
> tzcode==2024a
> tzdata==2024a
> uc-micro-py==1.0.3
633c638
< uvicorn==0.27.0
---
> uvicorn==0.27.1
641c646
< widgetsnbextension==4.0.9
---
> widgetsnbextension==4.0.10
645,646c650,651
< xarray==2024.1.1
< xarray-datatree==0.0.13
---
> xarray==2024.2.0
> xarray-datatree==0.0.14
652c657
< xclim==0.46.0
---
> xclim==0.48.0
654c659
< xesmf==0.8.2
---
> xesmf==0.8.3
680a686
> yamale==4.0.4
683c689
< zarr==2.16.1
---
> zarr==2.17.0

There do seem to be some package updates there that could potentially be altering the markdown -> html generation, but I'll need to poke around the JupyterLab code a bit more to find exactly how the markdown -> html generation is being done, and then try and test locally and play around with package versions.
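A diff this long is easier to scan when reduced to the packages most likely to touch Markdown rendering. Here is a small sketch, assuming each packages.txt is a list of name==version lines as in the diff above; the keyword filter ("jupyter", "myst", "markdown") is my own guess at relevant names, and the two inline snippets are a short excerpt of versions taken from that diff rather than the full files.

```python
def parse_packages(text: str) -> dict:
    """Parse 'name==version' lines into a {name: version} mapping."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name] = version
    return packages


def changed_packages(old: dict, new: dict, keywords=()) -> dict:
    """Return {name: (old_version, new_version)} for packages that differ.

    A package present on only one side is reported with None on the other.
    If keywords is non-empty, only names containing one of them are kept.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue
        if keywords and not any(k in name for k in keywords):
            continue
        changes[name] = (before, after)
    return changes


# Excerpt of the two packages.txt files, using versions from the diff above
old = parse_packages("""\
jupyterlab==4.0.11
jupyterlab-myst==2.1.0
numpy==1.26.3
lmoments3==1.0.6
""")
new = parse_packages("""\
jupyterlab==4.1.2
jupyterlab-myst==2.3.1
numpy==1.26.4
yamale==4.0.4
""")

suspects = changed_packages(old, new, keywords=("jupyter", "myst", "markdown"))
for name, (before, after) in suspects.items():
    print(f"{name}: {before} -> {after}")
```

Calling `changed_packages(old, new)` without keywords gives the full set of changes, including added and removed packages.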

If anyone has any clues or better ideas, please let me know 😂 - cc @yuvipanda @ranchodeluxe @abarciauskas-bgse

yuvipanda commented 7 months ago

@batpad looks like https://github.com/executablebooks/jupyterlab-myst/issues/224 :)

batpad commented 7 months ago

Thanks @yuvipanda !

So that was a fun rabbit-hole. I have a PR up with a possible fix, though ideally this may need a more involved fix: https://github.com/executablebooks/jupyterlab-myst/pull/228

Let's wait a day or two to see if it will be fixed soon upstream, and if not we can look at rolling back the version of jupyterlab-myst on our images?

batpad commented 7 months ago

The fix has been merged upstream! It may take a few more days for there to be a release up on PyPI - I'd recommend we just wait and then update our images when upstream has published the fix. @jsignell does that sound alright?

jsignell commented 7 months ago

we just wait and then update our images when upstream has published the fix

That sounds great! Thanks for taking this one on!

batpad commented 7 months ago

@jsignell unfortunately, the repeated headings that you noticed turn out to be a separate issue, related to the same upstream update. I filed an issue for that here: https://github.com/executablebooks/jupyterlab-myst/issues/229 . It is quite specific: it only affects headings in the first cell of the notebook, and it does not affect # headings. That is, ## Test will display twice, but # Test will not.

batpad commented 7 months ago

jupyterlab-myst 2.3.2 has been published to PyPI: https://pypi.org/project/jupyterlab-myst/ . This should fix the list formatting issue (though not the issue with headings in the first cell).
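To check from inside a running notebook whether a given image already carries the fixed release, something like the following could be used. It is a sketch only: the version comparison is deliberately simplified (numeric dotted components, pre-release suffixes ignored), and the helper names are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Turn '2.3.2' into (2, 3, 2); stops at the first non-numeric component."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_list_fix(installed: str, fixed_in: str = "2.3.2") -> bool:
    """True if the installed version is at least the release with the list fix."""
    return parse_release(installed) >= parse_release(fixed_in)


def installed_version(package: str = "jupyterlab-myst"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


current = installed_version()
if current is None:
    print("jupyterlab-myst is not installed in this environment")
else:
    print(f"jupyterlab-myst {current}: list fix present = {has_list_fix(current)}")
```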

@jsignell @ranchodeluxe - do you know what best next steps are? Should we try and get the version updated in the pangeo-notebook base image and use it from there, or just patch our image with the latest version?

batpad commented 7 months ago

Seems like the new version did not make it into the latest pangeo-notebook release (2024.03.30) - https://github.com/pangeo-data/pangeo-docker-images/blob/master/pangeo-notebook/packages.txt#L273

@ranchodeluxe let's quickly chat about this tomorrow and you can tell me what to do :-)

wildintellect commented 7 months ago

Pangeo seems to release almost weekly, so I think we should open a PR upstream.

batpad commented 7 months ago

The issue with markdown lists being formatted with checkboxes should now be fixed. For the issue of headings showing up twice in the first cell of notebooks, we have an upstream issue: https://github.com/executablebooks/jupyterlab-myst/issues/229 - that one seems less severe and I think it's okay to just track upstream? @jsignell if you prefer we also track it here, happy to open a separate issue for that one - and thanks again for all your work identifying this and helping test the fix!

jsignell commented 6 months ago

Totally fine with me to just track upstream