osamuaoki / imediff

2/3 file merge tool (Ncurses, CLI)
GNU General Public License v2.0
23 stars 1 forks source link

call to curses.newpad fails with curses function returned NULL when trying to merge big files #10

Open karltk opened 3 days ago

karltk commented 3 days ago

I am trying to merge text-based, computer-generated meta files. The files are typically 1000-10000 lines long, and while most of the lines are 80-120 characters, there are a handful of lines that are around 60000 characters wide, I've enjoyed imediff's functionality to do this work for years, but now I believe the files have grown too big for imediff to work.

The above dimensions appear to overload the call to curses.newpad() inside thenew_textpad()method of theTextPadclass intui.py`:

 git mergetool _targets_mini/meta/meta
Merging:
_targets_mini/meta/meta

Normal merge conflict for '_targets_mini/meta/meta':
  {local}: modified file
  {remote}: modified file
Traceback (most recent call last):
  File "/usr/bin/imediff", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/python3/dist-packages/imediff/__main__.py", line 324, in main
    display_instance.command_loop(tutorial=tutorial)
  File "/usr/lib/python3/dist-packages/imediff/tui.py", line 332, in command_loop
    curses.wrapper(self.gui_loop)
  File "/usr/lib/python3.12/curses/__init__.py", line 94, in wrapper
    return func(stdscr, *args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/imediff/tui.py", line 421, in gui_loop
    self.new_textpad()
  File "/usr/lib/python3/dist-packages/imediff/tui.py", line 617, in new_textpad
    self.textpad = curses.newpad(conth + 1, max(80, contw + 1))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_curses.error: curses function returned NULL
_targets_mini/meta/meta seems unchanged.
Was the merge successful [y/n]?                                       

In my case, conth = 12799 and contw = 57753 when this happens. Assuming a naive implementation that just allocates a block of memory conth * contw large, of 2 bytes per character, this would require about 1.4G of RAM, which is a lot for a text buffer, but easily within the capacity of modern computers.

Digging further into the curses.newpad() function I see that it calls newpad() in the curses library proper. In turn, this calls _nc_makenew which performs a dimension_limit() check. This relies on the bit width of the NCURSES_SIZE_T type. This is set by autoconf at configuration time, and defaults to int if the code is compiled to be reentrant and cf_dft_ordinate_type otherwise. The cf_dft_ordinate_type defaults to short in the end [ref].

Looking at the debian/rules for the libcurses6 package, it's not clear to me that the curses library is configured to be reentrant (--enable-reentrant) or to be used with pthreads (--with-pthread) which would, in turn, enable reentrancy.

I therefore suspect that the max width and height of the pad is capped to a signed short (32767), and that's why I'm running into this issue, despite my computer actually having enough memory for the buffer to be created. I don't see any easy way to fix this (writing a custom pad implementation isn't super-appealing...), but perhaps you do?

I am using imediff 2.8, but from what I can tell from the changelog, my problem probably persists also for 2.11, since the affected part of the code has not been touched.

osamuaoki commented 3 days ago

Great! Here is the situation:

I had a impression this bug was some kind of integer limitation.

I am not maintaining ncurses. If you can talk to the ncurses maintainer to fix this for good, Geat.

In the mean time, you can try 2.11 (get its source package and rebuild it in stable. It's buildable.)

You can use git rebase on auto split commits.

Osamu

osamuaoki commented 3 days ago

FYI: I set the number of lines to use auto split mode as:

MAX_LINES_IMEDIFF="16000"

To force auto-mode, please use "-a" option.

osamuaoki commented 3 days ago

I checked Debian package dependency.

osamuaoki commented 3 days ago

FYI: The latest 2.11 imediff came with git-ime internal logic handles rename and delete gracefully for multi-file split by file case. 2.8 was good only for add and modification cases.

Auto mode is for git-ime.

I use gitk to inspect each commit when using git rebase.

osamuaoki commented 3 days ago

If I extrapolate from your message, as for ABI, it seems ncurses needs to offer ABI7 and Python needs to be compiled against it. Looking at the lib-package name, Debian only offers ABI6 in sid.

If you want to address this issue, please contact curses and python core Debian maintainers.

osamuaoki commented 3 days ago

Let me record idea for imediff improvement:

--lines option: limit imediffprocessed lines in one curses screen.

Implementation idea in python:

Non-trivial but quite useful UI with existing ncurses with python code only. patch welcome.

osamuaoki commented 3 days ago

Hmmm... it's not too difficult. It's less than 10-20 lines of code. I just need to add scope jumping event logic.

diff --git a/src/imediff/initialize_args.py b/src/imediff/initialize_args.py
index 90dc990..50e6605 100644
--- a/src/imediff/initialize_args.py
+++ b/src/imediff/initialize_args.py
@@ -78,6 +78,13 @@ def initialize_args(logfile="imediff.log"):
     )
     pa.add_argument("--mode", "-m", action="store_true", help="Display mode column")
     pa.add_argument("--mono", action="store_true", help="Force monochrome display")
+    pa.add_argument(
+        "--scope",
+        "-s",
+        action="store",
+        default=10000,
+        help="scope of lines in one ncurses screen (default=10000)",
+    )
     pa.add_argument(
         "--sloppy", action="store_true", help="Allow one to save unresolved contents"
     )
diff --git a/src/imediff/tui.py b/src/imediff/tui.py
index 62535ea..2c99023 100644
--- a/src/imediff/tui.py
+++ b/src/imediff/tui.py
@@ -322,6 +322,8 @@ class TextPad(TextData):  # TUI data
         # Init from commandline/configuration parameters
         self.mode = args.mode
         self.mono = args.mono
+        self.pointer = 0
+        self.scope = args.scope
         if self.diff_mode == 2:
             self.color = confs["color_diff2"]
         else:
@@ -588,12 +590,26 @@ class TextPad(TextData):  # TUI data
             logger.debug("command-loop")
         return

+    def next_textpad_scope(self):
+        if (self.pointer + self.scope ) < len(self.opcodes):
+            self.pointer = self.pointer + self.scope
+            self.new_textpad()
+        else:
+            pass # popup notification
+
+    def prev_textpad_scope(self):
+        if (self.pointer - self.scope ) >= 0:
+            self.pointer = self.pointer - self.scope
+            self.new_textpad()
+        else:
+            pass # popup notification
+
     def new_textpad(self):
         """Create new curses textpad"""
         # pre-scan content to get big enough textpad size
         conth = 0  # content height
         contw = 0  # content width
-        for i in range(len(self.opcodes)):
+        for i in range(self.pointer, min(self.pointer + self.scope, len(self.opcodes))):
             self.set_row(i, conth)  # record textpad row position in chunk
             tag = self.get_tag(i)
             content = self.get_content(i)  # list()

When I find time, I will finish and test it.