hpjansson / chafa

📺🗿 Terminal graphics for the 21st century.
https://hpjansson.org/chafa/
GNU Lesser General Public License v3.0

[Proposal] Preview mode with live parameter adjustment #23

Closed: klopsi closed this issue 2 years ago

klopsi commented 5 years ago

It's a great success already! Chafa's preprocessor creates blotchy messes out of images whose colors are far from the RGB primaries, though. One just needs to learn the tool.

Chafa vs Unicolexport

However, I find myself going back and forth between GIMP and chafa many times, tweaking color gradients and hoping for the best.

This workflow could be made quicker if chafa had a preview mode in which the user could, for example, raise/lower the red, green, and blue gamma with R/r, G/g, B/b, or the gamma of all channels with V/v. Likewise, preprocessing could be toggled or given a variable slider, and contrast/brightness controls could be added, all adjustable in a live preview window.

Once the result is acceptable, the user would hit S to save, or something along those lines.
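Something like this rough shell sketch gives an idea of the loop I mean, assuming ImageMagick's convert is available for the gamma adjustments (filenames and step sizes are just placeholders):

#!/usr/bin/env bash
# Rough preview loop: tweak per-channel gamma, re-render with chafa, save on 's'.
src=input.png            # placeholder input image
rg=1.0; gg=1.0; bg=1.0   # per-channel gamma values

render () {
    convert "$src" -channel R -gamma "$rg" \
                   -channel G -gamma "$gg" \
                   -channel B -gamma "$bg" +channel /tmp/preview.png
    clear
    chafa --size=80 /tmp/preview.png
    echo "R=$rg G=$gg B=$bg   [R/r G/g B/b adjust, s save, q quit]"
}

render
while read -rsn1 key; do
    case "$key" in
        R) rg=$(awk "BEGIN{print $rg+0.1}") ;;
        r) rg=$(awk "BEGIN{print $rg-0.1}") ;;
        G) gg=$(awk "BEGIN{print $gg+0.1}") ;;
        g) gg=$(awk "BEGIN{print $gg-0.1}") ;;
        B) bg=$(awk "BEGIN{print $bg+0.1}") ;;
        b) bg=$(awk "BEGIN{print $bg-0.1}") ;;
        s) cp /tmp/preview.png adjusted.png && break ;;
        q) break ;;
        *) continue ;;
    esac
    render
done

A native preview mode could of course re-render straight from the library without the temporary files, but the keybindings above are the kind of thing I mean.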

Thanks, cheers

hpjansson commented 5 years ago

Thanks for filing the issue!

I agree, an interactive tool would be useful. I'm not sure if it should be in chafa proper or in a separate tool -- I think of the former as a "batch tool" wrapping the library core as minimally as possible. If interactive editing were separate, we'd also be much freer to add all kinds of effects there without worrying about cluttering chafa with more dependencies, documentation, etc.

For images that the preprocessor does poorly on, I sometimes have better luck with -p off combined with --fill ascii or --fill braille (since "fill" dither is only applied to blank cells, this works better with bigger output images because there's more flat area). Plus optionally adding --color-space din99d if there is still too little color.
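Concretely (the filename is just a placeholder), that combination looks like:

$ chafa -p off --fill braille --color-space din99d --size=140 photo.jpg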

Ideally the preprocessor would do the best possible job automatically. It would be useful to have an example image that came out badly (like the one you made) and a tweaked input image that produces a good result to see how far off the mark it is right now.

In general it's hard to make people look good with -c 16 -- unless you're using subcell dithering like Unicolexport is -- so I think we need to support that too (I'll make a separate issue for it).

hpjansson commented 5 years ago

@klopsi I've added a few features to master that might improve output for your use case somewhat: --dither, --dither-intensity and --dither-grain.
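For reference, a hypothetical invocation combining all three might look like this; the values are only starting points to experiment with:

$ chafa --dither ordered --dither-grain 1x1 --dither-intensity 1.0 photo.jpg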

hpjansson commented 2 years ago

Going to roll these suggestions into the general interactive mode work.

Duplicate of #25

clort81 commented 1 year ago

Hi, I've done a lot of conversions now with tiv and chafa, and wanted to note as an aside how much faster chafa is:

$ time tiv -w 140 forest_031big.jpg > /dev/null
real    0m5.163s
user    0m1.909s
sys     0m0.222s
$ time chafa --size=140 forest_031big.jpg > /dev/null
real    0m0.196s
user    0m0.382s
sys     0m0.019s

That's on an ARM Cortex-A73 CPU. It would be interesting to see if the SSE2NEON stuff could improve that.

But on-topic: after a few thousand images, let me tell you the workflow for getting really good results. You have to shift the image around, make it larger or smaller, and be able to adjust individual elements so they line up with character-cell boundaries where you need detail.

This can be done in GIMP with the warp tool, scaling, cropping, etc., but an interactive chafa editor where you can scale the input image, scale the output text, shift the input left or right by pixels, and eventually shift selected regions would be a huge productivity gain.
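Until such an editor exists, the scale/shift part can at least be scripted outside GIMP, e.g. with ImageMagick (hypothetical filenames, offsets picked by eye):

$ convert forest_031big.jpg -crop +3+0 +repage shifted.png   # trim 3 px off the left edge
$ convert shifted.png -resize 103% scaled.png                # nudge the overall scale
$ chafa --size=140 scaled.png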

Eventually the editor should allow you to highlight a region that looks good with the mouse and 'lock it in' while you make shift adjustments to the rest of the image.

You've done all the excellent documentation to allow me to do this with the API; I just have to get in gear and do it.

But if you have any unreleased interactive chafa code (C or Python), I'd like to see it so I don't duplicate work.

Thanks hpjansson, you're great.

hpjansson commented 1 year ago

> Hi, I've done a lot of conversions now with tiv and chafa, and wanted to note as an aside how much faster chafa is:

> $ time tiv -w 140 forest_031big.jpg > /dev/null
> real    0m5.163s
> user    0m1.909s
> sys     0m0.222s
> $ time chafa --size=140 forest_031big.jpg > /dev/null
> real    0m0.196s
> user    0m0.382s
> sys     0m0.019s

Music to my ears :-)

> That's on an ARM Cortex-A73 CPU. It would be interesting to see if the SSE2NEON stuff could improve that.

Indeed. Chafa isn't optimized for ARM, so there should be plenty of gains to be had there.

> But on-topic: after a few thousand images, let me tell you the workflow for getting really good results. You have to shift the image around, make it larger or smaller, and be able to adjust individual elements so they line up with character-cell boundaries where you need detail.

> This can be done in GIMP with the warp tool, scaling, cropping, etc., but an interactive chafa editor where you can scale the input image, scale the output text, shift the input left or right by pixels, and eventually shift selected regions would be a huge productivity gain.

> Eventually the editor should allow you to highlight a region that looks good with the mouse and 'lock it in' while you make shift adjustments to the rest of the image.

I've thought about this, and even explored a mode that would automatically try different offsets of each cell to minimize local error, but the result wasn't great because the cells no longer connected. As you're saying, it needs guidance in order to work well. I hadn't thought about marking and locking in regions, that's a good idea!

> You've done all the excellent documentation to allow me to do this with the API; I just have to get in gear and do it.

> But if you have any unreleased interactive chafa code (C or Python), I'd like to see it so I don't duplicate work.

Unfortunately I don't have anything finished yet, but I'm slowly working towards an interactive mode in Chafa (adding support code to the library, and soon an event loop). When that's done it would be possible to make such an application work in the terminal. I can't say how long it will take (day job will take up most of my energy), but 2-3 months maybe.

Making an app with a GUI (e.g. with GTK) would be possible right now, but then you'd have to code a graphical character cell grid widget to display the art.

By the way, @GuardKenzie is working on Python bindings at https://chafapy.mage.black/ -- an amazing effort. So if Python's your jam, there you go.

> Thanks hpjansson, you're great.

Thanks, it means a lot!

clort81 commented 1 year ago

> I've thought about this, and even explored a mode that would automatically try different offsets of each cell to minimize local error, but the result wasn't great because

I was thinking about this, but a pixel-counting PSNR wouldn't be a good test. Is there something that would be a good judge of human-subjective image quality, something that pays attention to important details like eyes and hands but doesn't care so much about exact placement or colors?

cdluminate commented 1 year ago

> I've thought about this, and even explored a mode that would automatically try different offsets of each cell to minimize local error, but the result wasn't great because

> I was thinking about this, but a pixel-counting PSNR wouldn't be a good test. Is there something that would be a good judge of human-subjective image quality, something that pays attention to important details like eyes and hands but doesn't care so much about exact placement or colors?

Unfortunately, "a good judge of human-subjective image quality" is a very cutting-edge research area. Even state-of-the-art deep generative models still use metrics like PSNR, SSIM, FID, etc., to measure image quality, and all of them suffer from some kind of misalignment with human perception. I'd say that, at the current stage, the best judge of image quality is still a human.

hpjansson commented 1 year ago

This is very interesting to me. I've been discussing the high end of character art synthesis with a few people over the years and have amassed a huge link folder and some vague ideas of my own. A couple of things:

I've kept Chafa simple because, as @cdluminate says, this is an active research area and I don't want to implement something that may turn out to be impractical, hard to maintain, or a technological dead end. Also, there are plenty of simpler things that need doing!

There are probably things that can be done with transforms, filtering and decomposition, e.g. to detect dots, circles and other kinds of shapes that may be well represented by some character but would not match the literal pixel matrix (due to e.g. misalignment, aspect, scale). @csdvrx had some very interesting thoughts on this; she brought up the Hough transform, band-pass filtering and many other things in a conversation we had. This would be cheap computationally, but would require some experimentation before arriving at a fruitful approach that can be implemented.

As far as I know, this is still the state of the art: a character-art CNN. However, it is laser-focused on Shift-JIS with a variable-width font (12pt MS Gothic, an old MS Windows default). I think a similar approach could work for fixed-width fonts and colors, but it would be harder to find or generate training material for this, since color adds complexity and you'd want it to work on things that aren't just line art. Also, a CNN model would be too large to ship with Chafa (likely hundreds of megabytes).

I have a vague, unexplored idea that a statistical approach trained on ANSI art, where you count the occurrences of particular characters in the immediate neighborhood of each other (i.e. the 8 characters surrounding each character of a given type), could yield a suitably small model that could work as a correction factor by applying the counts as weights after the initial matrix pass. That means a character that "lost" the matrix pass could be nudged into a winning position by its context. For instance, if there are horizontal line characters to the left and right of it, a horizontal line character may get extra weighting because unbroken lines are favored by the training data. I did some back-of-the-envelope calculations and think that with trimming and compression the model might be kept under 5 MB.

This would be a lot of fun to explore further, if only there were time :-)