AnonymouX47 / term-image

Display images in the terminal with python
https://term-image.readthedocs.io
MIT License

Scanning directories and loading animated images too slow #10

Closed leo-arch closed 2 years ago

leo-arch commented 2 years ago

Description: First of all, great work. I'm really interested in it. Everything works fine when you open a single file or several files using a glob expression (e.g. ~/Downloads/*.png). However, whenever you want to browse a directory, it takes too long (20 secs or more) to load the TUI viewer (while recursively scanning for image files in the given directory). Once the TUI finally loads, it is also a bit slow whenever you hover over a directory (a few seconds, depending on the directory).

To Reproduce: term-img DIR or term-img DIR/*

Expected behavior: term-img should be quicker and smoother when recursively browsing directories.

Desktop:

Package info:

Terminal Emulator:

AnonymouX47 commented 2 years ago

Thanks for the feedback. 😃

Concerning directory searching:

Since you didn't use the -r/--recursive option, the search shouldn't take even a second, because all it does is identify at least one image file directly inside the directory.

With the -r option, the duration of the search depends on:

  1. The number of subdirectories.
  2. The number of non-image files encountered.

This is due to the fact that it identifies image files by their content (just the header, first few bytes) and not by filename extension.
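To illustrate why content-based detection makes every non-image file cost an open and a read, here is a minimal sketch of header sniffing. It is not term-img's actual code: the function names and the small signature table are assumptions made for illustration, and only three well-known signatures are covered.

```python
# Hypothetical helper names; the signature table is a small illustrative subset.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"\xff\xd8\xff": "JPEG",
}
_HEADER_LEN = max(len(sig) for sig in _SIGNATURES)


def identify_header(header):
    """Return a format name if the bytes start with a known signature, else None."""
    for signature, name in _SIGNATURES.items():
        if header.startswith(signature):
            return name
    return None


def sniff_image_type(path):
    """Open the file and inspect only its first few bytes."""
    try:
        with open(path, "rb") as f:
            return identify_header(f.read(_HEADER_LEN))
    except OSError:
        # Unreadable files are treated as non-images.
        return None
```

Unlike an extension check, this has to open each file, so a directory full of non-image files is still paid for one file at a time.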

Concerning Loading directories in the TUI:

Note: The TUI itself doesn't take time to show up, it's loading (and partially rendering) the images that takes a while.

When a directory is loaded, all files in the directory are checked and the non-images are discarded.

Rendering contributes only partially, because only the image(s) currently in view are rendered.

I'll see what I can do about making the loading faster. Any ideas?

leo-arch commented 2 years ago

Thanks for your quick answer! Here's a more precise description of the issue:

I run: ./term-img ~/Downloads, and this is the output:

Checking directory '/home/user/Downloads/'...
... Done!

It gets stuck here for exactly 46 secs, taking 50% of my CPU.

Once it finally loads the TUI, it gets stuck there (hovering on Downloads/) for another 40 secs or so (taking 50% of the CPU again) before I can actually do something.

As to recommendations to make the loading thing faster, I didn't actually look at the code, so I cannot recommend anything useful. Maybe I could take a look at it to see what could be causing this issue.

EDIT: It seems to happen mostly with my Downloads folder. With other directories it works much faster (almost as expected, I'd say). My bet is that whatever image previewer you're using is having trouble dealing with some specific file type.

leo-arch commented 2 years ago

Well, after some tests I conclude that the source of the issue is actually related to some file types. Previewing PostScript files could be a bit slow. In my case, the main obstacle was a really big GIF file (2187 frames!). I guess you could provide a command line switch to disable automatic previews for GIF files, or, even better, find a programmatic way to not automatically play GIF files with more than, say, 100 frames.

AnonymouX47 commented 2 years ago

Wow! That's a lot of time... It took way less than that for the entire DCIM directory (with thousands of images) on my (not recent) Android device.

It seems to be an edge case like you've described. I'll try getting a sample image of that kind to test.

EDIT: I suppose the file will also be really large... among the planned features is an option to specify a maximum file size which I plan to set to a reasonable default.

AnonymouX47 commented 2 years ago

As for the code responsible...

term_img.tui.main.scan_dir() scans a directory and creates Image widgets for every image in it and a generator for subdirectories.

term_img.cli.check_dir() handles the scanning of directories to determine whether they contain images or subdirectories that do.

leo-arch commented 2 years ago

Whichever way you find to tackle this issue will be fine.

This is a principle I follow whenever I write some new feature for my programs: no matter how cool I think it is, if it might noticeably reduce the program's performance (or somehow have a negative impact on the user), just disable it by default and allow the user to enable it via some command line switch or option in the config file. Users get quickly disappointed and give up as soon as they experience some issue with your program, and that's something that we, as developers, need to prevent at all costs.

Please let me know if you find some way to improve this loading issue.

AnonymouX47 commented 2 years ago

I really appreciate your quick reply... that's one golden principle.

I'm currently looking into it. 😃 Thank you very much.

AnonymouX47 commented 2 years ago

Well, after some tests I conclude that the source of the issue is actually related to some file types. Previewing PostScript files could be a bit slow. In my case, the main obstacle was a really big GIF file (2187 frames!). I guess you could provide a command line switch to disable automatic previews for GIF files, or, even better, find a programmatic way to not automatically play GIF files with more than, say, 100 frames.

Any idea where I can get such a GIF?

leo-arch commented 2 years ago

This is the big GIF file I was talking about:

https://mega.nz/file/x4p1ma4B#PpYiUtwWFl5kUKOkyGavV9lXLuD4qsHB9uLRs1rfZvM

AnonymouX47 commented 2 years ago

Thanks

AnonymouX47 commented 2 years ago

After investigating, it turns out it's actually the large number of frames that's responsible.

During initialization, the number of frames is stored for later use. 👇🏾 https://github.com/AnonymouX47/term-img/blob/926b4d1433a7b572bdc6236b4642153787b14aad/term_img/image.py#L100

It's getting the number of frames (image.n_frames) that takes forever. (Note: n_frames is a descriptor, not just a data attribute, so it actually executes code to compute the value.) There's also a smaller delay when performing seek operations on the PIL Image instance; its magnitude depends on the distance between the current position and the new position.

Looking up the specification of the GIF format, no part of it specifies the number of frames in the image, so the count has to be determined by reading through almost the entire file.
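To make the point concrete, here is a rough sketch of what counting frames involves at the file-format level. This is not how Pillow or term-img does it; `count_gif_frames` is a hypothetical helper that walks the GIF block structure (header, logical screen descriptor, then a chain of extension and image blocks) and counts image descriptors. It skips pixel data without decoding it, but it still has to visit every block up to the trailer.

```python
import io


def _skip_sub_blocks(stream):
    """Skip a chain of data sub-blocks (size byte + payload), ending at a zero size."""
    while True:
        size = stream.read(1)
        if not size or size[0] == 0:
            return
        stream.seek(size[0], io.SEEK_CUR)


def count_gif_frames(stream):
    """Count image descriptors in a seekable binary GIF stream (hypothetical helper)."""
    if stream.read(6) not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF stream")
    screen = stream.read(7)  # logical screen descriptor
    if len(screen) < 7:
        raise ValueError("truncated GIF stream")
    if screen[4] & 0x80:  # global color table present
        stream.seek(3 * (2 << (screen[4] & 0x07)), io.SEEK_CUR)

    frames = 0
    while True:
        marker = stream.read(1)
        if not marker or marker == b";":  # end of data or trailer (0x3B)
            return frames
        if marker == b"!":  # extension introducer (0x21)
            stream.seek(1, io.SEEK_CUR)  # extension label
            _skip_sub_blocks(stream)
        elif marker == b",":  # image separator (0x2C): one frame
            descriptor = stream.read(9)
            if len(descriptor) < 9:
                raise ValueError("truncated image descriptor")
            if descriptor[8] & 0x80:  # local color table present
                stream.seek(3 * (2 << (descriptor[8] & 0x07)), io.SEEK_CUR)
            stream.seek(1, io.SEEK_CUR)  # LZW minimum code size
            _skip_sub_blocks(stream)
            frames += 1
        else:
            raise ValueError("unexpected block marker")
```

Since each sub-block carries at most 255 bytes of payload, the number of steps grows with the size of the file, which is consistent with the delay described above.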

This being the case, I don't think selecting GIFs to display based on frame count will work, since it's not possible without first determining the frame count. A "max size" option is the only way I can see at the moment, since the file size should be roughly proportional to the frame count, and the file size is actually what causes the program to spend so long reading the file to determine the frame count.

As for other file types... I think a "max size" option and the existing "max pixels" option should suffice.
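A "max size" check is cheap because it needs only file metadata, not file contents. The following is a minimal sketch under that assumption; `split_by_max_size` and its parameters are hypothetical names, not an existing term-img option or API, and the threshold is left to the caller.

```python
import os


def split_by_max_size(paths, max_bytes, getsize=os.path.getsize):
    """Partition paths into (accepted, skipped) by file size alone.

    `getsize` is injectable so the logic can be exercised without real files.
    """
    accepted, skipped = [], []
    for path in paths:
        if getsize(path) > max_bytes:
            skipped.append(path)  # too large: never opened or decoded
        else:
            accepted.append(path)
    return accepted, skipped
```

Files in the skipped list are never opened, so a huge GIF costs one size lookup instead of a full scan for its frame count.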

AnonymouX47 commented 2 years ago

Concerning scanning directories to get all the images in them... I'm considering some form of concurrency or parallelism where a separate thread or process preloads directories up to a certain depth (like a "look forward").

Then for the first directory or top level (i.e. the one containing the sources passed at the command line), loading images could be gradual, checking the files starting with the smallest in size... and they get added to the menu/grid as they're ready.

Note: I haven't implemented any of these yet, still theoretical. 😃
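The gradual, smallest-first loading idea could be sketched roughly as follows. This is purely illustrative and not what term-img implements: all names are hypothetical, `check` stands in for whatever test decides that a file is a displayable image, and a thread plus a queue is just one of the options mentioned above (a process would be the alternative for CPU-bound checks).

```python
import os
import queue
import threading

_DONE = object()  # sentinel marking the end of the results stream


def _load_smallest_first(paths, check, results, getsize):
    """Worker: check files in ascending size order, emitting accepted ones."""
    for path in sorted(paths, key=getsize):
        if check(path):
            results.put(path)
    results.put(_DONE)


def start_loader(paths, check, getsize=os.path.getsize):
    """Start a background loader; return (results queue, worker thread)."""
    results = queue.Queue()
    worker = threading.Thread(
        target=_load_smallest_first,
        args=(paths, check, results, getsize),
        daemon=True,
    )
    worker.start()
    return results, worker


def drain(results):
    """Yield accepted paths as they become ready, until the sentinel arrives."""
    while True:
        item = results.get()
        if item is _DONE:
            return
        yield item
```

A real TUI would more likely poll the queue with `get_nowait()` from its event loop instead of blocking in `drain()`, adding entries to the menu or grid as they arrive.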

leo-arch commented 2 years ago

At least in the case of this huge GIF file, the size approach should work: its size is 8 MB, a lot for a simple image file. However, I'm not sure what a sane default value for max size would be. A PostScript file could easily take 3 MB, and previewing it is quite CPU intensive. Maybe 1 MB is a good default value.

AnonymouX47 commented 2 years ago

Hmm... True. A sane default could be difficult to come by but in the end, the user could simply change the config value. 🤷🏾‍♂️ Maybe the size could apply only to certain image types. 🤔

leo-arch commented 2 years ago

As to the idea of gradually loading/displaying images, I guess that's a nice approach, provided the loading process runs on a separate thread and the TUI stuff on another one. The process should be as smooth as possible, even if this implies not previewing some files at all. Maybe a message in the previewing panel warning the user about the file not being displayed and allowing him/her to forcefully load the image by means of some keyboard shortcut. Just thinking aloud.

AnonymouX47 commented 2 years ago

Exactly the same thoughts here.

As for the "forced load", I already implemented something similar for images above the "max pixels" so I'll simply extend that.

leo-arch commented 2 years ago

Maybe the size could apply only to certain image types

Absolutely. Simple image files like PNG and JPG do not seem to cause any issue (as of now). For the time being, both GIF and PostScript should be taken into account regarding max size.

AnonymouX47 commented 2 years ago

Good. I'll approach the problem this way then.

Thanks so much.

AnonymouX47 commented 2 years ago

Hello @leo-arch !

It's definitely been a while 😃 but the good news is... time hasn't passed in vain.

Here is a highlight of the performance-related fixes and improvements I've implemented since then for this issue:

For the library:

For the CLI:

For the TUI:

Footnotes

  1. I reported the issue to the Pillow developers and opened a PR to fix the issues, though my changes would've broken some key things. 😞 So, one of the core developers implemented the improvements in a better way that wouldn't break things. See https://github.com/python-pillow/Pillow/pull/6077. The changes have been merged and will be released with the next feature release, 9.1.0, by April 1. When this version is out, I plan to revert the changes to TermImage.n_frames because the performance will be smooth enough.

  2. When Pillow 9.1.0 is out, I'll be extending the caching criteria of ImageIterator to cache frames based on a given maximum number of frames, which is a better metric than file size.


If you don't mind, please update your installation from the main branch to test the changes.

I'll gladly appreciate your feedback. Thanks 😃

AnonymouX47 commented 2 years ago

In addition to these... I'll also be implementing the following before releasing 0.2.0:

leo-arch commented 2 years ago

Hey @AnonymouX47! Sorry for the delay.

It's much better now! Congrats and thanks! I really like this. I guess PIL internally uses something along the lines of cacalib or chafa (i.e., ASCII/ANSI rendering). Do you plan to add support for ueberzug, w3img, sixel or kitty? These protocols are really good at displaying images on the terminal, and besides they are more and more widely adopted.

I myself implemented a files previewer (not only images, but also PDF, document, postscript files, and even sound files) using fzf and ueberzug for my CliFM, but only as a plugin (a shell script indeed): it's nice and all, but far from ideal. A third party utility able to do this smoothly, and able to be integrated into CliFM, would be really nice. And term-img is on the right track for sure.

Keep up the good work!

AnonymouX47 commented 2 years ago

It's much better now! Congrats and thanks!

Great, happy to hear this!

I guess PIL internally uses something along the lines of cacalib or chafa (i.e., ASCII/ANSI rendering)

Oh, no. PIL only decodes the images, the rendering is implemented here in term-img.

Do you plan to add support for ueberzug, w3img, sixel or kitty?

Another developer using the library also suggested something like this. I'm currently looking at adding support for some other character-based styles and the sixel, kitty and iterm2 protocols. I had checked out ueberzug before but haven't yet decided if I want to add support... I remember it has Python bindings, so I guess it shouldn't be much of a hassle. 🤔 I'll look into w3img also.

Before any of these can be implemented, though, a major API change is required as regards the sizing unit (see #16).

I myself implemented a files previewer (not only images, but also PDF, document, postscript files, and even sound files) using fzf and ueberzug for my CliFM.

Yeah, I did check it out 😃... Great work 👍

A third party utility able to do this smoothly, and able to be integrated into CliFM, would be really nice. And term-img is on the right track for sure.

True, I'll put more work into it. 😃

Thanks so much, I really appreciate your feedback and suggestions... would love to see more of that.

I'll update you on the protocol support issue as I progress.

leo-arch commented 2 years ago

I'll update you on the protocol support issue as I progress

Thanks! That would be great.