executablebooks / jupyter-cache

A defined interface for working with a cache of executed jupyter notebooks
https://jupyter-cache.readthedocs.io
MIT License

Migrating cached notebooks to >= 0.5.0 #113

Open davidvandebunte opened 10 months ago

davidvandebunte commented 10 months ago

Describe the bug

Thanks for this great project and for Jupyter Book! Obviously many of the same people work on both projects.

After upgrading to a recent version of this package, I'm getting the error:

Exception occurred:
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/db.py", line 62, in session_context
    raise RuntimeError(
RuntimeError: Unexpected error accessing jupyter cache, it may need to be cleared.
The full traceback has been saved in /tmp/sphinx-err-npmhkzbj.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: nbproject

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/db.py", line 59, in session_context
    yield session
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/db.py", line 212, in create_record
    session.commit()
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1969, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1256, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1231, in _prepare_impl
    self.session.flush()
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4312, in flush
    self._flush(objects)
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4447, in _flush
    with util.safe_reraise():
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4408, in _flush
    flush_context.execute()
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 1227, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1416, in execute
    return meth(
           ^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 517, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1639, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1848, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1988, in _exec_single_context
    self._handle_dbapi_exception(
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2344, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1969, in _exec_single_context
    self.dialect.do_execute(
  File "/opt/conda/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 922, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: nbproject
[SQL: INSERT INTO nbproject (uri, read_data, assets, created, traceback) VALUES (?, ?, ?, ?, ?)]
[parameters: ('/home/jovyan/work/notes/ssc/7-3.md', '{"type": "plugin", "name": "myst_nb_md"}', '[]', '2024-01-25 13:55:36.499957', '')]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/jupyter_book/sphinx.py", line 171, in build_sphinx
    app.build(force_all, filenames)
  File "/opt/conda/lib/python3.11/site-packages/sphinx/application.py", line 329, in build
    self.builder.build_update()
  File "/opt/conda/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 288, in build_update
    self.build(to_build,
  File "/opt/conda/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 302, in build
    updated_docnames = set(self.read())
                           ^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 409, in read
    self._read_serial(docnames)
  File "/opt/conda/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 430, in _read_serial
    self.read_doc(docname)
  File "/opt/conda/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 483, in read_doc
    publisher.publish()
  File "/opt/conda/lib/python3.11/site-packages/docutils/core.py", line 217, in publish
    self.document = self.reader.read(self.source, self.parser,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/sphinx/io.py", line 103, in read
    self.parse()
  File "/opt/conda/lib/python3.11/site-packages/docutils/readers/__init__.py", line 78, in parse
    self.parser.parse(self.input, document)
  File "/opt/conda/lib/python3.11/site-packages/myst_nb/sphinx_.py", line 150, in parse
    with create_client(
  File "/opt/conda/lib/python3.11/site-packages/myst_nb/core/execute/base.py", line 83, in __enter__
    self.start_client()
  File "/opt/conda/lib/python3.11/site-packages/myst_nb/core/execute/cache.py", line 55, in start_client
    stage_record = cache.add_nb_to_project(str(self.path), read_data=read_fmt)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/main.py", line 422, in add_nb_to_project
    return NbProjectRecord.create_record(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/db.py", line 208, in create_record
    with session_context(db) as session:  # type: Session
  File "/opt/conda/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/opt/conda/lib/python3.11/site-packages/jupyter_cache/cache/db.py", line 62, in session_context
    raise RuntimeError(
RuntimeError: Unexpected error accessing jupyter cache, it may need to be cleared.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/bin/jupyter-book", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/jupyter_book/cli/main.py", line 317, in build
    builder_specific_actions(
  File "/opt/conda/lib/python3.11/site-packages/jupyter_book/cli/main.py", line 525, in builder_specific_actions
    raise RuntimeError(_message_box(msg, color="red", doprint=False)) from result
RuntimeError:
===============================================================================

There was an error in building your book. Look above for the cause.

===============================================================================

Reproduce the bug

This is likely a known issue: see the breaking-change entry in jupyter-cache/CHANGELOG.md that mentions nbstage and nbproject.
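One way to confirm that diagnosis is to list the tables in the cache database. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the database location in the usage comment (a file called global.db inside _build/.jupyter_cache) is an assumption that should be checked against the actual cache folder.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at db_path."""
    # Open read-only so that a diagnostic run cannot modify the cache.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def needs_migration(tables):
    """True when the old nbstage table exists and nbproject does not."""
    return "nbstage" in tables and "nbproject" not in tables


# Usage (the path is an assumption, adjust it to your project):
# print(needs_migration(list_tables("_build/.jupyter_cache/global.db")))
```

If needs_migration returns True, the database still has the old table name, which would be consistent with the "no such table: nbproject" error in the traceback.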

Unfortunately, I have many notebooks that take hours to execute and use R packages whose results are not fully reproducible 🙄 (see, e.g., Practice: Chp. 14).

I haven't found a way to avoid re-execution, so for now I plan to rebuild the cache over a day or two. I'd just like to confirm that there is likely no other way around this problem.

List your environment

$ jupyter-book --version
Jupyter Book      : 0.15.1
External ToC      : 0.3.1
MyST-Parser       : 0.18.1
MyST-NB           : 0.17.2
Sphinx Book Theme : 1.0.1
Jupyter-Cache     : 0.6.1
NbClient          : 0.7.4

agoose77 commented 10 months ago

@davidvandebunte before you re-execute, I think it should be possible to manually upgrade the database. From the CHANGELOG, it appears that the table name changed, and the schema includes some new fields. I think if we make those changes by hand, we should be in luck.

davidvandebunte commented 10 months ago

Thanks @agoose77! As you suggested, it wasn't all that difficult:

-- Rename the pre-0.5.0 table to its new name.
ALTER TABLE `nbstage` RENAME TO `nbproject`;
-- Add the two columns that the old table lacked.
ALTER TABLE nbproject ADD COLUMN read_data JSON;
ALTER TABLE nbproject ADD COLUMN exec_data JSON;

The resulting schema doesn't exactly match that of a regenerated cache: so far I've noticed that read_data should be JSON NOT NULL and that the column order differs. Still, this has been good enough so far.
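For anyone who prefers to script this, here is a minimal sketch that applies the three ALTER TABLE statements above through Python's standard sqlite3 module and skips any step that has already been done. The function name migrate_cache_db is hypothetical, and the sketch deliberately reproduces the nullable columns from the manual fix rather than the exact schema of a regenerated cache.

```python
import sqlite3


def migrate_cache_db(db_path):
    """Rename nbstage to nbproject and add the missing columns, if needed."""
    conn = sqlite3.connect(db_path)
    try:
        tables = {
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        if "nbstage" in tables and "nbproject" not in tables:
            conn.execute("ALTER TABLE nbstage RENAME TO nbproject")
        elif "nbproject" not in tables:
            raise RuntimeError("neither nbstage nor nbproject found; wrong file?")

        # PRAGMA table_info yields one row per column; index 1 is the name.
        columns = {row[1] for row in conn.execute("PRAGMA table_info(nbproject)")}
        for column in ("read_data", "exec_data"):
            if column not in columns:
                conn.execute(f"ALTER TABLE nbproject ADD COLUMN {column} JSON")
        conn.commit()
    finally:
        conn.close()
```

Because each step is checked first, running the function a second time on an already migrated database should leave it unchanged.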

agoose77 commented 10 months ago

Ah, you're quicker than me! For others, I created an Alembic project to do this automatically: https://github.com/executablebooks/jupyter-cache-upgrader/tree/main/alembic

As ever, please please please back up the cache before trying this or any other surgery!
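A backup can be as simple as copying the whole cache folder before touching it. The sketch below uses only the standard library; the function name backup_cache is hypothetical, and the cache location in the usage comment is an assumption.

```python
import shutil
import time
from pathlib import Path


def backup_cache(cache_dir):
    """Copy the cache directory to a timestamped sibling and return its path."""
    src = Path(cache_dir).resolve()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.bak-{stamp}")
    # copytree raises FileExistsError rather than overwriting an existing copy.
    shutil.copytree(src, dest)
    return dest


# Usage (the path is an assumption, adjust it to your project):
# print(backup_cache("_build/.jupyter_cache"))
```

Copying the directory rather than only the database file keeps the cached notebook files alongside the records that refer to them.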