jupyter / nbformat

Reference implementation of the Jupyter Notebook format
http://nbformat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
259 stars 153 forks source link

5.10.2 break server in certain circumstances (when `data_dir` was not yet created) #396

Closed krassowski closed 5 months ago

krassowski commented 5 months ago

https://github.com/jupyter/nbformat/pull/378 aimed to simplify NotebookNotary.data_dir. Previously the default NotebookNotary.data_dir of JupyterApp would be used as a default, simplifying a bit:

  def _data_dir_default(self):
      app = JupyterApp()
      app.initialize(argv=[])
      return app.data_dir

The default for JupyterApp.data_dir it is defined here as:

  def _data_dir_default(self) -> str:
      d = jupyter_data_dir()
      ensure_dir_exists(d, mode=0o700)
      return d

https://github.com/jupyter/nbformat/pull/378 changed the NotebookNotary.data_dir to:

    def _data_dir_default(self):
        return jupyter_data_dir()

Which notably does not include the call to ensure_dir_exists.

Further, if JupyterApp.data_dir is customized, this change will no longer be reflected in NotebookNotary.data_dir which I think is a breaking change.

krassowski commented 5 months ago

CC @cmd-ntrf @ivanov what do you think about reverting https://github.com/jupyter/nbformat/pull/378? While adding ensure_dir_exists would be easy, I am more worried about implications of JupyterApp.data_dir customization not being passed down to NotebookNotary.data_dir.

For context, we observed this failure on CI in jupyterlab: https://github.com/jupyterlab/jupyterlab/issues/15985

ivanov commented 5 months ago

Oh bummer. I think a simple revert would get us back to the previous behavior of JupyterApp.data_dir being passed down, but then it will also re-introduces the potential for ignoring of the JUPYTER_DATA_DIR environment variable, which jupyter_data_dir currently accommodates. And I think neither of these two paths accommodate a subclassing of JupyterApp that ends up using the NotebookNotary.

Should NotebookNotary, which currently subclasses LoggingConfigurable actually be subclassing JupyterApp?

Also independent of that, should it be the responsibility of jupyter_data_dir in jupyter_core to ensure that the directory exists? Can we pass an optional JupyterApp to jupyter_data_dir that uses it's data_dir if not set to default? Should we eliminate jupyter_data_dir function from jupyter_core altogether, to avoid this kind of dance?

krassowski commented 5 months ago

But the previous logic does call jupyter_data_dir in the end so why would it ignore JUPYTER_DATA_DIR? I do not see it.

krassowski commented 5 months ago

And I think neither of these two paths accommodate a subclassing of JupyterApp that ends up using the NotebookNotary.

Why is this a problem?

FYI my concern be was specifically with the new solution not respecting c.JupyterApp.data_dir traitlet while the old one did. If ones subclasses JupyterApp then it should not matter in my view but maybe I'm missing something

krassowski commented 5 months ago

Should NotebookNotary, which currently subclasses LoggingConfigurable actually be subclassing JupyterApp?

Maybe. I do not know enough about implications, and I suspect it might break something downstairs. One would need to dig though git and issues history to see why it is this way.

ivanov commented 5 months ago

But the previous logic does call jupyter_data_dir in the end so why would it ignore JUPYTER_DATA_DIR? I do not see it.

You're right, I've struck that part out.

And I think neither of these two paths accommodate a subclassing of JupyterApp that ends up using the NotebookNotary.

Why is this a problem?

Actually, after refreshing my memory by poking through jupyter_core and traitlets, specifically due to traitlets' Application being a SingletonConfigurable, I think I'm wrong there, too. If the subclass is initialized, the data_dir from the subclass will be used in the original code.

Should NotebookNotary, which currently subclasses LoggingConfigurable actually be subclassing JupyterApp?

Maybe. I do not know enough about implications, and I suspect it might break something downstairs. One would need to dig though git and issues history to see why it is this way.

Just looking at the previous MultipleInstanceError logic, I think that'll be the problem. JupyterApp (jupyter_core) inherits from Application (traitlets) inherits from SingletonConfigurable (traitlets) , so you cannot have multiple instances of that subclass.

ivanov commented 5 months ago

Thanks for identifying the issue and putting out the revert @krassowski !