jupyter / nbdime

Tools for diffing and merging of Jupyter notebooks.
http://nbdime.readthedocs.io

Memory consumption #473

Open lc0 opened 5 years ago

lc0 commented 5 years ago

I ran nbdiff on a relatively small file (~1.7 MB) and memory usage went up to ~3.5 GB.

That is rather surprising, given that the number of cells with differences is not large.

I wonder if it would make sense to do local diffs only for cells that do not match exactly? PS: I have not read much of the code, so this might not sound very smart 🙈
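To make the "local diffs" idea concrete, here is a minimal sketch using only the Python standard library. It is not how nbdime works internally; it only illustrates hashing whole cells first so that identical cells are skipped cheaply and only the differing ranges are left for a detailed diff. The helper names `cell_key` and `changed_cell_ranges` are hypothetical.

```python
import difflib
import hashlib
import json


def cell_key(cell):
    """Hash a cell's full JSON so identical cells compare cheaply."""
    blob = json.dumps(cell, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def changed_cell_ranges(cells_a, cells_b):
    """Return difflib opcodes for the cell ranges that differ.

    Runs of identical cells are matched by hash and dropped, so a
    detailed (expensive) diff is only needed inside the returned ranges.
    """
    keys_a = [cell_key(c) for c in cells_a]
    keys_b = [cell_key(c) for c in cells_b]
    matcher = difflib.SequenceMatcher(a=keys_a, b=keys_b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

Each returned opcode is a tuple `(tag, i1, i2, j1, j2)` naming the slice of cells in each notebook that would still need a cell-level diff.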

lc0 commented 5 years ago

This library looks like it could compute diffs quite quickly:

https://github.com/google/diff-match-patch

vidartf commented 5 years ago

That sounds like a lot. Is this a notebook you are able to share? It could be very useful for profiling. Depending on where the issue is, likely improvements are:

vidartf commented 5 years ago

Note: The actual library used for doing text diffs is unlikely to affect this issue, but that should of course be considered as well.

vidartf commented 3 years ago

Closing due to missing repro.

afeld commented 3 years ago

I have a file for which nbdiff's memory use seems to grow without bound. To reproduce:

  1. Download https://github.com/afeld/python-public-policy/blob/129d5150e1796ecde2c947b4694c3430f388c8a1/lecture_3.ipynb
  2. Run nbdiff lecture_3.ipynb lecture_3.ipynb
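To put a number on the memory use when running the steps above, one option is to launch the command from Python and read the peak resident set size of the child process afterwards. This is a rough sketch, assuming a Unix-like system (the `resource` module is not available on Windows); the helper name `peak_child_rss` is hypothetical. Note that `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS, and that a run which truly grows without bound has to be interrupted before any figure is reported.

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd to completion and return the peak RSS of child processes.

    The value is the maximum over all children waited for so far,
    so call this once per fresh Python process for a clean reading.
    """
    subprocess.run(
        cmd,
        check=False,
        stdout=subprocess.DEVNULL,  # the diff output itself is not needed
        stderr=subprocess.DEVNULL,
    )
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Example call for the reproduction above (requires nbdiff on PATH):
# peak_child_rss(["nbdiff", "lecture_3.ipynb", "lecture_3.ipynb"])
```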