kovidgoyal / kitty-fosshack2024

Projects for FOSSHack 2024
GNU General Public License v3.0

Image browsing in the terminal #4

Open kovidgoyal opened 3 months ago

kovidgoyal commented 3 months ago

Implement a simple image browsing and viewing kitten that works in the terminal using the kitty graphics protocol.

The existing kitty icat kitten can already be used to dump one or more images. However, this kitten, tentatively named iv, would allow browsing a folder recursively for images, presented in a grid view with thumbnails, with keyboard and mouse controls to select images and perform basic operations such as viewing them at full size, deleting them, renaming them, etc.

The kitten would have two modes, browse images and view single image. Would start in browse images mode unless only a single image is specified to view.

Can use the existing kitten infrastructure to do a lot of the heavy lifting. See the icat kitten and also the diff kitten (which displays image diffs).

Should support animated thumbnails and animated image display for image formats that support animation such as GIF.

Would be written purely in Go.

swastkk commented 1 month ago

Hey @kovidgoyal, our team has decided to work upon this issue under FOSS Hack 2024 and has a basic wireframe completed. Please assign us this issue.

kovidgoyal commented 1 month ago

done.

PythonHacker24 commented 1 month ago

Hey Kovid, I am in @swastkk teams for the FOSS Hackathon and we were engaged in research on implementing the given functionality. I would like to present a short proposal on what the end kitten looks like and how we are basically structuring it.

FOSS-HACK24

Recursive Image Grid Display Kitten for Kitty Terminal

Project Proposal

Objective

Implementing a feature for Kitty terminal as a kitten that allows recursive display of images in a responsive grid layout, adapting to window size changes and handling images with different aspect ratios for better browsing of images in a directory. This kitten is purely written in Golang and allows users to enter a small session where images can be browsed, viewed on a bigger page and return back to console smoothly.

Key Features

  1. Recursive directory scanning for images
  2. Dynamic grid layout calculation
  3. Efficient image resizing and caching
  4. Responsive handling of window size changes
  5. Keyboard navigation for image browsing
  6. Integration with Kitty's graphics protocol

Implementation Strategy

1. Image Discovery and Metadata Extraction

2. Grid Layout Calculation

3. Image Resizing and Caching

4. Window Size Handling

5. Image Rendering

6. Keyboard Navigation

Project Structure

File Descriptions

Here is the basic experimentation file structure we were working with, divided into various functionalities. It doesn't represent the end goal, just a view of what's going on around our minds.

This structure organizes the project into logical components, separating concerns and making the codebase more maintainable. Each file focuses on a specific aspect of the functionality, allowing for easier development, testing, and future enhancements.

PythonHacker24 commented 1 month ago

We would like to have any advice about the approach for the implementation of the functionality from you. It would help us optimize or even upgrade our software design.

PythonHacker24 commented 1 month ago

Also, we would like to know the mode for contribution. Like do we have to get kitty forked and have a PR or create our own repository?

PythonHacker24 commented 1 month ago

And since the hackathon is gonna start and we need to get things done in like 2-3 days, how much do you expect that we should get completed as a team of 4 people?

kovidgoyal commented 1 month ago

On Fri, Jul 26, 2024 at 10:50:39AM -0700, Aditya Patil wrote:

Hey Kovid, I am in @swastkk teams for the FOSS Hackathon and we were engaged in research on implementing the given functionality. I would like to present a short proposal on what the end kitten looks like and how we are basically structuring it.

FOSS-HACK24

Recursive Image Grid Display Kitten for Kitty Terminal

Project Proposal

Objective

Implementing a feature for Kitty terminal as a kitten that allows recursive display of images in a responsive grid layout, adapting to window size changes and handling images with different aspect ratios for better browsing of images in a directory. This kitten is purely written in Golang and allows users to enter a small session where images can be browsed, viewed on a bigger page and return back to console smoothly.

Key Features

  1. Recursive directory scanning for images
  2. Dynamic grid layout calculation
  3. Efficient image resizing and caching
  4. Responsive handling of window size changes
  5. Keyboard navigation for image browsing
  6. Integration with Kitty's graphics protocol

This is all fine. As a stretch goal you can add mouse integration.

Implementation Strategy

1. Image Discovery and Metadata Extraction

  • Implement a recursive directory scanner in icat/discovery.go

This should be a new kitten in kittens/iv/

  • Use Go's image package to extract basic metadata (dimensions, format)

Go's image package is very limited in format support. See the tools/images/ package in kitty source code that integrates with both native Go image and ImageMagick. In particular the OpenImageFromPath function is what you need.

  • Utilize goroutines for concurrent processing of multiple images

2. Grid Layout Calculation

  • Develop an algorithm in icat/layout.go to calculate optimal grid dimensions

This will be in iv/layout.go

  • Consider terminal dimensions and number of images
  • Implement adaptive sizing to maintain aspect ratios
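
One way to approach the layout step is sketched below. It assumes a fixed slot size in terminal cells and a fixed gap between slots; `Grid`, `computeGrid` and `fitInside` are hypothetical names chosen for illustration, not part of kitty.

```go
package main

import "fmt"

// Grid describes a thumbnail grid measured in terminal cells.
type Grid struct {
	Cols, Rows   int // columns per row, and rows needed for all images
	CellW, CellH int // size of one grid slot in terminal cells
}

// computeGrid fits count thumbnails of slot size cellW x cellH into a
// terminal termCols cells wide, leaving gap cells between slots.
func computeGrid(termCols, count, cellW, cellH, gap int) Grid {
	cols := (termCols + gap) / (cellW + gap)
	if cols < 1 {
		cols = 1 // always show at least one column, even if it overflows
	}
	rows := (count + cols - 1) / cols // ceiling division
	return Grid{Cols: cols, Rows: rows, CellW: cellW, CellH: cellH}
}

// fitInside scales (w, h) down to fit within (maxW, maxH) while keeping the
// aspect ratio. Images that already fit are returned unchanged.
func fitInside(w, h, maxW, maxH int) (int, int) {
	if w <= 0 || h <= 0 {
		return 0, 0
	}
	if w <= maxW && h <= maxH {
		return w, h
	}
	// Compare w/maxW with h/maxH by cross-multiplying to stay in integers.
	if w*maxH >= h*maxW {
		nh := h * maxW / w // width is the limiting dimension
		if nh < 1 {
			nh = 1
		}
		return maxW, nh
	}
	nw := w * maxH / h // height is the limiting dimension
	if nw < 1 {
		nw = 1
	}
	return nw, maxH
}

func main() {
	g := computeGrid(100, 10, 20, 10, 2)
	fmt.Printf("%d columns x %d rows\n", g.Cols, g.Rows)
	w, h := fitInside(4000, 2000, 200, 200)
	fmt.Printf("thumbnail %dx%d\n", w, h)
}
```

`fitInside` is meant to be applied in pixels, since terminal cells are usually not square; the slot's pixel size would be derived from the cell size the terminal reports.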

3. Image Resizing and Caching

  • Use the github.com/disintegration/imaging library for efficient resizing

Again, use the existing code in tools/images for this; it is better than disintegration as it works with NRGB/NRGBA images, which is what you need to work with the kitty graphics protocol.

  • Implement a simple in-memory cache in utils/cache.go

You don't need an in-memory cache. You need two things:

1) A thumbnail on-disk cache for the grid layout.

This will contain RGB/RGBA data of thumbnails of the images, suitable for use with the kitty graphics protocol. It must be on-disk anyway so that you can directly send file paths to render to kitty for max efficiency, when running on the same machine.

2) A render cache for the full size layout.

Same as above but for the full size view with zoom levels so rendering at different sizes

  • Consider implementing a worker pool for parallel processing

4. Window Size Handling

  • Utilize Kitty's existing window size detection mechanism
  • Implement a debounce function to limit layout recalculations
  • Update grid layout and re-render images on size changes

You should structure your kitten using the existing tui.loop package; this gives you resizing info and so on for free.

5. Image Rendering

  • Extend the existing icat kitten in icat/render.go

Again, this should be a separate kitten; it can of course import code from icat as needed.

  • Use Kitty's graphics protocol for efficient image placement
  • Implement batch rendering for improved performance

6. Keyboard Navigation

  • Implement keyboard event handling in icat/navigation.go
  • Use arrow keys for navigation and 'q' for quitting
  • Highlight the currently selected image

You will eventually need to implement extended selection, but for a MVP a single selection is sufficient.
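
For the single-selection MVP, arrow-key movement over the grid reduces to index arithmetic. The sketch below is illustrative only: `Direction` and `moveSelection` are hypothetical names, and decoding key events is left to whatever the kitten's event loop provides.

```go
package main

import "fmt"

// Direction is an arrow-key movement within the grid.
type Direction int

const (
	Left Direction = iota
	Right
	Up
	Down
)

// moveSelection returns the new selected index for a grid of count images
// laid out row by row with cols images per row. Moves that would leave the
// grid keep the current selection.
func moveSelection(sel, count, cols int, d Direction) int {
	if count <= 0 || cols <= 0 {
		return 0
	}
	next := sel
	switch d {
	case Left:
		next = sel - 1
	case Right:
		next = sel + 1
	case Up:
		next = sel - cols
	case Down:
		next = sel + cols
	}
	if next < 0 || next >= count {
		return sel
	}
	return next
}

func main() {
	sel := 0
	for _, d := range []Direction{Right, Down, Down, Down} {
		sel = moveSelection(sel, 10, 4, d)
	}
	fmt.Println("selected index:", sel)
}
```

In this version Left and Right wrap across row boundaries (moving right from the end of one row selects the first image of the next), which is a design choice rather than a requirement.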

Project Structure

File Descriptions

Here is the basic experimentation file structure we were working with, divided into various functionalities. It doesn't represent the end goal, just a view of what's going on around our minds.

  • main.go: This is the entry point of the icat kitten. It contains the main logic and orchestrates the overall functionality.

  • discovery.go: Handles recursive directory scanning and image metadata extraction. It identifies image files and gathers basic information like dimensions and format.

  • layout.go: Implements the grid layout calculation algorithm. It determines the optimal arrangement of images based on the terminal window size and the number of images.

  • render.go: Manages the rendering of images using Kitty's graphics protocol. It handles the placement and display of images in the calculated grid layout.

  • navigation.go: Implements keyboard navigation functionality, allowing users to browse through the displayed images using arrow keys or other defined shortcuts.

  • cache.go: Located in the utils folder, this file implements image caching mechanisms to improve performance by storing and retrieving processed images.

This structure organizes the project into logical components, separating concerns and making the codebase more maintainable. Each file focuses on a specific aspect of the functionality, allowing for easier development, testing, and future enhancements.

That's fine.

kovidgoyal commented 1 month ago

On Fri, Jul 26, 2024 at 10:53:33AM -0700, Aditya Patil wrote:

Also, we would like to know the mode for contribution. Like do we have to get kitty forked and have a PR or create our own repository?

Fork and PR is fine.

kovidgoyal commented 1 month ago

On Fri, Jul 26, 2024 at 10:54:35AM -0700, Aditya Patil wrote:

And since the hackathon is gonna start and we need to get things done in like 2-3 days, how much do you expect that we should get completed as a team of 4 people?

Depends on your skill levels :) But I suggest you aim for a basic MVP that shows the grid and the full screen view with simple keyboard controls.

I would hope some of you will continue to work on it after the MVP as there will be a lot more work required before it reaches shippable quality.

5hubham5ingh commented 3 weeks ago

Hello, I have a script that provides similar functionality to browse images in a grid and it works fine but the script uses icat to display images one by one which is not efficient.

https://github.com/user-attachments/assets/e724d3d1-4b2e-467a-8719-29255070f2c3

I am trying to run icat in parallel, but then a few of those commands fail with this error:

Error: The --place option can only be used with a single image, not 2
^[_Gi=1;OK^[\^[_Gi=2;OK^[\^[_Gi=3;OK^[\^[[?62;c
Error: This terminal does not support the graphics protocol use a terminal such as kitty, WezTerm or Konsole that does. If you are running inside a terminal multiplexer such as tmux or screen that might be interfering as well.

This is how I am trying to run icat in parallel:

kitten icat --scale-up --place 42x10@88x2 /home/ss/wallpaper/train1.webp &
kitten icat --scale-up --place 42x10@131x2 /home/ss/wallpaper/forest2.webp &
kitten icat --scale-up --place 42x10@2x13 /home/ss/wallpaper/mountain_and_lake.jpg &
kitten icat --scale-up --place 42x10@45x13 /home/ss/Downloads/wallpaper/train3.jpg &

What am I doing wrong and is there a better way to run icat in parallel than this?

kovidgoyal commented 3 weeks ago

icat does I/O with the terminal by default, so you can't run more than one instance of it at a time. If you want to run more than one instance, then you need to use it in integration mode; see the docs for how to do that: https://sw.kovidgoyal.net/kitty/kittens/icat/