deshaw / jupyterlab-execute-time

A JupyterLab extension for displaying cell timings
BSD 3-Clause "New" or "Revised" License

Use with git gives messed git diffs #130

Open grzegorz700 opened 1 day ago

grzegorz700 commented 1 day ago

When this extension is used with git-based systems, it produces 10 (5 * 2) diff lines for every changed cell: five timestamp lines removed and five added.


   "metadata": {
    "execution": {
  -   "iopub.execute_input": "2024-10-14T13:10:30.905308Z",
  -   "iopub.status.busy": "2024-10-14T13:10:30.904740Z",
  -   "iopub.status.idle": "2024-10-14T13:10:30.908169Z",
  -   "shell.execute_reply": "2024-10-14T13:10:30.907722Z",
  -   "shell.execute_reply.started": "2024-10-14T13:10:30.905290Z"
  +   "iopub.execute_input": "2024-10-14T19:16:26.414571Z",
  +   "iopub.status.busy": "2024-10-14T19:16:26.413960Z",
  +   "iopub.status.idle": "2024-10-14T19:16:26.417570Z",
  +   "shell.execute_reply": "2024-10-14T19:16:26.417137Z",
  +   "shell.execute_reply.started": "2024-10-14T19:16:26.414551Z"
    }
   },
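The "5 * 2" count can be checked with a small sketch. It assumes the metadata is serialized one key per line, as in the diff above; the function name `count_changed_lines` is made up for illustration and is not part of the extension.

```python
import difflib
import json

# Timestamps copied from the diff above.
before = {"execution": {
    "iopub.execute_input": "2024-10-14T13:10:30.905308Z",
    "iopub.status.busy": "2024-10-14T13:10:30.904740Z",
    "iopub.status.idle": "2024-10-14T13:10:30.908169Z",
    "shell.execute_reply": "2024-10-14T13:10:30.907722Z",
    "shell.execute_reply.started": "2024-10-14T13:10:30.905290Z",
}}
after = {"execution": {
    "iopub.execute_input": "2024-10-14T19:16:26.414571Z",
    "iopub.status.busy": "2024-10-14T19:16:26.413960Z",
    "iopub.status.idle": "2024-10-14T19:16:26.417570Z",
    "shell.execute_reply": "2024-10-14T19:16:26.417137Z",
    "shell.execute_reply.started": "2024-10-14T19:16:26.414551Z",
}}


def count_changed_lines(old: dict, new: dict) -> int:
    """Count added plus removed lines in a unified diff of the two dicts."""
    old_lines = json.dumps(old, indent=1).splitlines()
    new_lines = json.dumps(new, indent=1).splitlines()
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="")
    return sum(
        1
        for line in diff
        # Skip the '---' / '+++' file header lines of the unified diff.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


print(count_changed_lines(before, after))
```

Every re-executed cell rewrites all five timestamps, so each one contributes the same block of removed and added lines.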

This problem is well known and has no perfect solution. Drawing on several reference solutions, including a list from Stack Overflow, the advice in https://github.com/jupyterlab/jupyterlab/issues/9444#issuecomment-743992307, and other Stack Overflow answers, I propose a partial workaround setup.

Partial workaround:

Using this extension without massive diffs takes two stages.

Prevent the timestamps from being pushed to git:

  1. Create or edit a .gitattributes file in the root of your repository:
    touch .gitattributes
  2. Add the following line to the .gitattributes file:
    *.ipynb filter=clean_meta_ipynb
  3. Run:
    git config filter.clean_meta_ipynb.clean "jupyter nbconvert --to notebook --stdin --stdout --ClearMetadataPreprocessor.enabled=True"
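If clearing all metadata is broader than wanted, a narrower clean filter could be sketched in Python. This is an illustration only, not something from the extension or nbconvert: the names `strip_execution_metadata` and `run_filter` are hypothetical, and the sketch assumes the standard notebook JSON layout with a top-level "cells" list. Git clean filters read the file on stdin and write the cleaned version to stdout, which is what `run_filter` mirrors.

```python
import json


def strip_execution_metadata(notebook: dict) -> dict:
    """Remove only the 'execution' timing entry from each cell's metadata."""
    for cell in notebook.get("cells", []):
        # Other cell metadata (tags, collapsed, ...) and outputs are left alone.
        cell.get("metadata", {}).pop("execution", None)
    return notebook


def run_filter(src, dst) -> None:
    """Read a notebook from src, write the cleaned notebook to dst."""
    notebook = json.load(src)
    json.dump(strip_execution_metadata(notebook), dst, indent=1, ensure_ascii=False)
    dst.write("\n")


# To use this as a clean filter, a script would end with:
#     import sys
#     run_filter(sys.stdin, sys.stdout)
```

Such a script would be registered in place of the nbconvert command in step 3. The working copy keeps its timings; only the version git stores is cleaned.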

Prevent the diffs from being displayed in jupyterlab-git:

  4. Check where your nbdime configs live (the file is named nbdime_config.json):
    jupyter --paths
  5. Create or update your nbdime_config.json (e.g. ~/.jupyter/nbdime_config.json)
  6. Add the following lines to it:
    {
      "NbDiff": {
        "Ignore": {
          "/metadata": true,
          "/cells/*/metadata": true
        }
      },
      "Extension": {
        "Ignore": {
          "/metadata": true,
          "/cells/*/metadata": true
        }
      }
    }

    Alternatively, we could try a more precise exclusion such as "/cells/*/metadata": ["execution"].

  7. Restart JupyterLab.

Drawbacks:

I am sharing this solution mainly for people who want to use this extension without having to remove other information from their notebooks (e.g. outputs).

However, I would love to see a better solution.

mlucool commented 1 day ago

This question is probably better addressed outside this plugin, but have you tried https://github.com/deshaw/nbstripout-fast? With it, nbdime does not show timestamp diffs, and the timestamps are not committed.