sgkit-dev / sgkit

Scalable genetics toolkit
https://sgkit-dev.github.io/sgkit
Apache License 2.0

Use vcf2zarr in GWAS tutorial notebook #1258

Closed tomwhite closed 2 months ago

tomwhite commented 2 months ago

The docs build works for me locally, with the exact same set of Python package versions, but on CI it's failing with:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/jupyter_cache/executors/utils.py", line 58, in single_nb_execution
    executenb(
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/nbclient/client.py", line 1314, in execute
    return NotebookClient(nb=nb, resources=resources, km=km, **kwargs).execute()
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/jupyter_core/utils/__init__.py", line 165, in wrapped
    return loop.run_until_complete(inner)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/nbclient/client.py", line 709, in async_execute
    await self.async_execute_cell(
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/nbclient/client.py", line 1062, in async_execute_cell
    await self._check_raise_for_error(cell, cell_index, exec_reply)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/nbclient/client.py", line 918, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
ds = sg.load_dataset("1kg.vcz")
------------------

and

FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/sgkit/sgkit/docs/examples/1kg.vcz/.zmetadata'
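
A missing `.zmetadata` file means the loader could not find consolidated Zarr metadata at that path. That happens either because the `1kg.vcz` directory was never created on CI or because it was written without consolidated metadata. Which of the two applies here is not stated in this thread. A minimal sketch of a check that could run in a notebook cell before `sg.load_dataset` to tell the two cases apart (the helper name `check_vcz_store` and its return strings are hypothetical, not part of sgkit or vcf2zarr):

```python
from pathlib import Path


def check_vcz_store(store_path):
    """Classify a Zarr v2 directory store by what is present on disk.

    Hypothetical diagnostic helper; it only inspects the filesystem
    and does not open the store.
    """
    store = Path(store_path)
    if not store.is_dir():
        # The conversion step never produced the store at this location.
        return "missing-store"
    if not (store / ".zmetadata").is_file():
        # The store exists but has no consolidated metadata file.
        return "missing-consolidated-metadata"
    return "ok"


if __name__ == "__main__":
    # Path is relative to the current working directory, as in the notebook.
    print(check_vcz_store("1kg.vcz"))
```

Because the path in the failing cell is relative, printing `Path.cwd()` alongside this check would also show whether CI executes the notebook from a different working directory than the local build.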

tomwhite commented 2 months ago

Thanks @jeromekelleher. Here's the new notebook (updates in the "Importing data from VCF" section):

https://github.com/sgkit-dev/sgkit/blob/7094d3cf192dfc25ff69456ec7f1e71e7df2c264/docs/examples/gwas_tutorial.ipynb

I'll merge it later today.

jeromekelleher commented 2 months ago

LGTM - short and sweet!