dsanson / termpdf.py

A graphical pdf and epub reader that works inside the kitty terminal
MIT License
496 stars 30 forks source link

Linux support is broken #2

Closed rien333 closed 4 years ago

rien333 commented 4 years ago

Kitty works perfectly well under linux, so it makes sense for this project to also support linux. If I open a ~8 page pdf, my terminal goes blank, the termpdf process uses about 6GB of ram and 100% of a CPU core, while just showing a blank screen. Opening a big pdf file almost halted my computer, I think because it used close to 9GB of ram.

Not really a helpful issue description, but I will try to investigate why this happens when I have a little more time. Given that you probably are on macOS, you might want to install a VM to do some simple testing one day. I'm using kitty from git btw, on a fully updated Arch linux system, with the newest PyMuPDF.

Great initiative btw! Folllowing emacs pdf-tools in terms of features is a nice idea.

dsanson commented 4 years ago

Thanks for the report. That seems like a very bad outcome! As you surmise, I haven't tested in on Linux. I wonder if it is getting a bunch of extra escape codes on stdin, and those are getting processed as though they were key presses? I think my current strategy for reading escape codes on stdin (used for communicating with kitty about images), and distinguishing them from user input, is fragile, and reflects my ignorance about the right way to do this in Python. So maybe the script rapidly tried to page through every page in your PDF, converting each one, and loading it into kitty?

rien333 commented 4 years ago

Just ran termpdf.py in another linux terminal app (st), and, oddly enough, the same thing happens. Not really sure if that has anything to do with input processing, but possibly.

rien333 commented 4 years ago

Update (possibly an uninteresting for you though): termpdf doesn't use up an extreme amount of resources when ran in the st terminal anymore. Still happens in GNOME vte though.

I have some time tomorrow, maybe I'll have a look at what's the problem.

dsanson commented 4 years ago

I just pushed a rewrite of the whole thing. I think it should now do a better job of proactively capturing stray escape code responses from kitty, which I think was the cause of your problem.

rien333 commented 4 years ago

Nope, still the same, except for the fact that the page range is shown. There might still be something going wrong with escape codes, as running and closing termpdf creates weird escape code garbage when clicking on that terminal with the mouse. I'll try and investigate tomorrow.

rien333 commented 4 years ago

@dsanson Did some investigation.

The function where execution halts and memory consumption goes out of control is display_page. Specifically, it never reaches the body of write_chunked. Most likely, this has to do with the size of the pixmap object pix, which is passed as an argument to write_chunked. I suspect there is an error in the pixmap size calculation, because if I do

def display_page(self, bar, p, display=True):
    ...
    mat = mat.preRotate(self.rotation)
    pix = page.getPixmap(matrix = mat, alpha=self.alpha)
    print(type(pix.samples))

it takes about 30 seconds before it prints out something like class 'bytes', while my whole 16GB (!) ram and some swap fills up. This print statement does eventually finish, however. Additationally, the dimensions of the pixmap seem completely out of proportion. If I run:

def display_page(self, bar, p, display=True):
    ...
    # build cmd to send to kitty
    cmd = {'i': p + 1, 't': 'd', 's': pix.width, 'v': pix.height}
    print(cmd)

it outputs: {'i': 1, 't': 'd', 's': 49107, 'v': 63550, 'f': 24}. I do have a High DPI screen (3000x2000), but such a large width and height seem excessive.

Some additional data points

Maybe something goes wrong during the zoom factor calculation?

# get zoomed and rotated pixmap
mat = fitz.Matrix(factor, factor)
print(mat)

outputs: Matrix(80.23989898989899, 0.0, 0.0, 80.23989898989899, 0.0, 0.0)

edit: seems like it. dh = scr.height - scr.cell_height comes down to dh == 63550.

rien333 commented 4 years ago

Basically, the width and height of my terminal screen are calculated incorrectly for some reason. In the get_size method of the Screen class, fcntl.ioctl(fd, termios.TIOCGWINSZ, buf) outputs a width and height of 65535 65535 (which happens to be the largest value an unsigned short can hold).

If run the same code in a python interpreter, however, it outputs ~relatively normal values~ (EDIT: only from time to time, often I get 65535):

>>> import array
>>> import sys
>>> buf = array.array('H', [0, 0, 0, 0])
>>> import fcntl
>>> import termios
>>> fd = sys.stdout
>>> fcntl.ioctl(fd, termios.TIOCGWINSZ, buf)
0
>>> print(buf)
array('H', [16, 92, 1196, 432])

~Maybe we're dealing with a kitty bug?~ Yeah okay, this seems like a kitty bug:

$ kitty icat --print-window-size
65535x65535
rien333 commented 4 years ago

Opened a bug here https://github.com/kovidgoyal/kitty/issues/2026

rien333 commented 4 years ago

The problem seems to be that fish shell screws around with TIOCGWINSZ. If run termpdf.py in a fresh bash instance it works fine.

dsanson commented 4 years ago

Did you ever resolve this problem? On my old laptop, I had been using termpdf.py within fish without problems, but now I am having exactly the problem you report, and need to run it in bash instead.

rien333 commented 4 years ago

I don't think so really, just switched to bash whenever I wanted to use this. However, I can't test this now, because for some reason libmupdf (need for the python bindings) seems broken on Arch Linux.

dsanson commented 4 years ago

For anyone who finds this, a temporary fix is to patch fish to remove the changes that caused the problem. See https://github.com/fish-shell/fish-shell/issues/6994#issuecomment-633303722