JacobDev1 / xl-converter

Easy-to-use image converter for modern formats. Supports multithreading, drag 'n drop, and downscaling.
https://codepoems.eu/xl-converter
Other
178 stars, 5 forks

Process cjxl.exe can easily take up 100% of the memory at a certain point and completely paralyze the computer. #49

Closed Ukhryuk-Hai closed 3 months ago

Ukhryuk-Hai commented 3 months ago

Here are a few conditions:

  1. If the image resolution is too high.

  2. If the image is corrupted. I have encountered situations where cjxl.exe kept trying to process it and I eventually had to press the reset button.

  3. The most typical situation: I queue a lot of small images (up to 2 MB each) for conversion with 6 threads, but one or more large ones are among them, which triggers the problem.

So, I suggest you modify the program logic as follows:

  1. Dynamically change the number of threads depending on the image resolution.

  2. Build in a mechanism for emergency forced termination of all copies of cjxl.exe if the amount of free memory has decreased to 10%.

I have already asked ChatGPT to write a Python script to terminate cjxl.exe, but it would be better if this functionality is integrated directly into the converter.

JacobDev1 commented 3 months ago

v1.0.2 will feature a new "Multithreading" option under Settings -> Conversion. You can switch from "Performance" to "Low RAM" which will help when dealing with large images. Sort images by resolution and process really big ones with the "Low RAM" mode.

When encoding large images to JPEG XL, you may want to trigger the streaming encoding. Use the following settings:

Streaming encoding is very easy on RAM. However, it is only available under certain circumstances in cjxl.

As for the proposed changes, those are surface-level "fixes", not a long-term solution. I may rework the way encoding is done at some point, but that is a lot of work. You can keep using your script if you are happy with it.

Ukhryuk-Hai commented 3 months ago

Am I correct in understanding that you think it is too complicated to automate sorting and use only one thread for large images?

Ukhryuk-Hai commented 2 months ago

Here is a script. It starts XL Converter, checks whether less than 10% of memory is left, and if so terminates all cjxl.exe processes. If there are no problems, it exits once the main program is closed.

I highly recommend this to everyone who has 8 GB of memory or less.

import psutil
import subprocess
import time
import winsound
import sys

# Path to xl-converter.exe
xl_converter_path = r"C:\Program Files (x86)\XL Converter\xl-converter.exe"

def terminate_cjxl_processes():
    for proc in psutil.process_iter(['pid', 'name']):
        if proc.info['name'] == 'cjxl.exe':
            try:
                # proc is already a Process object, so terminate it directly
                proc.terminate()
                print(f"Process {proc.info['pid']} (cjxl.exe) terminated")
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                # The process already exited or cannot be touched; skip it
                pass

def monitor_memory_usage(threshold=10):
    time.sleep(5)  # Delay before the first check
    while True:
        time.sleep(1)
        memory_info = psutil.virtual_memory()
        free_memory_percentage = memory_info.available * 100 / memory_info.total

        if free_memory_percentage <= threshold:
            print(f"Error: Less than {threshold}% memory available. Terminating cjxl.exe processes...")
            winsound.Beep(1000, 1000)  # Beep sound
            terminate_cjxl_processes()
            return False

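        # Prefetch names via process_iter so a process exiting mid-scan does not raise NoSuchProcess
        if not any(proc.info['name'] == 'xl-converter.exe' for proc in psutil.process_iter(['name'])):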
            print("Process xl-converter.exe has finished. Exiting script...")
            return True

if __name__ == "__main__":
    try:
        # Start xl-converter.exe
        xl_converter_process = subprocess.Popen(xl_converter_path)
        print(f"Process xl-converter.exe started with PID {xl_converter_process.pid}")
    except Exception as e:
        print(f"Failed to start xl-converter.exe: {e}")
        sys.exit(1)

    # Monitor memory usage
    if monitor_memory_usage():
        xl_converter_process.wait()  # Wait for xl-converter.exe to finish
    else:
        xl_converter_process.terminate()

JacobDev1 commented 2 months ago

This script could introduce nasty bugs related to subprocesses, like in #44. The program should be called by the user.

The "Low RAM" mode is a perfect fit for this purpose. It is how encoders were intended to be run by their developers. Use Effort 7. Effort 9 is extremely RAM-hungry.

"Low RAM" mode:

| Resolution | Effort (VarDCT) | RAM Required (approx.) |
| --- | --- | --- |
| 7680 × 4320 | 7 | 350 MB |
| 7680 × 4320 | 9 | 7.75 GB |

I did some tests and the only case where high RAM usage is a problem is when you're converting large images to JPEG XL with Efforts higher than 7. This includes "Intelligent Effort" and excludes "Lossless JPEG Transcoding".
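
As a rough back-of-the-envelope check, the figures in the table can be multiplied by the number of concurrent encoder processes. The sketch below assumes the worst case where every worker happens to be encoding a 7680 × 4320 image at the same moment, and that the per-image figures measured in "Low RAM" mode carry over unchanged to each concurrent process. The 6 workers come from the original report, and the function name is made up for illustration.

```python
# Approximate per-image RAM from the table above, in megabytes (1 GB = 1000 MB here).
RAM_PER_IMAGE_MB = {7: 350, 9: 7750}

def worst_case_ram_mb(effort: int, workers: int) -> int:
    """Upper-bound estimate: every worker encodes an 8K image at once."""
    return RAM_PER_IMAGE_MB[effort] * workers

for effort in (7, 9):
    total = worst_case_ram_mb(effort, workers=6)
    print(f"Effort {effort}, 6 workers: ~{total / 1000:.2f} GB")
```

Under these assumptions that is roughly 2.1 GB at Effort 7 and 46.5 GB at Effort 9, which is why a few large images in a batch can exhaust memory on a typical machine.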

The standard way to run these encoders is sequentially ("Low RAM" mode). The "Performance" mode eliminates I/O bottlenecks, but speed gains shrink as the images get larger.
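
The difference between the two modes can be pictured as a worker pool whose size is either 1 or several. This is only a sketch of the general idea, not XL Converter's actual code: `convert_all`, `convert`, and `fake_convert` are hypothetical names, and a real `convert` would launch the encoder for one file.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

def convert_all(paths: Iterable[str],
                convert: Callable[[str], str],
                max_workers: int) -> List[str]:
    """Run convert() on every path with at most max_workers running at once.

    max_workers=1 processes the files one after another (sequential,
    "Low RAM"-style behaviour); a larger value lets several run concurrently.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order regardless of finish order.
        return list(pool.map(convert, paths))

# Stand-in for a real encoder call, used only to make the sketch runnable.
def fake_convert(path: str) -> str:
    return path.rsplit(".", 1)[0] + ".jxl"

print(convert_all(["a.png", "b.png"], fake_convert, max_workers=1))
```

With `max_workers=1` only one image is in memory at a time, so peak RAM is bounded by the largest single image rather than by the sum over all workers.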

Regarding your previous question, let's examine your suggestions.

| Suggestion | Potential Challenges |
| --- | --- |
| Sorting items by res., then processing the largest first sequentially, slowly increasing concurrent thread limit. | The current time left algorithm becomes useless due to lack of uniform distribution. Users complain the conversion is "too slow to start up/finish". |
| Dynamically adjusting concurrent thread limit based on a scrambled data set. | Lower performance, needed fine-tuning, expected edge cases. |

The backend is planned to be reworked. This time-consuming code would get replaced anyway. While I understand your complaints, it's better to invest time into a more permanent solution.

Perhaps you should use something else. XnConvert, IrfanView, ImageMagick are all great choices.

Ukhryuk-Hai commented 2 months ago

I understand your position, but at the same time I think you should not take away users' ability to choose different settings. For example, I am ready to spend more time in order to save images in better quality (JXL effort 10). A temporary decrease in the number of threads seems to me a preferable and more logical option than a forced decrease in quality, even if it is imperceptible to the eye. After all, there are plenty of people who prefer to listen to music in FLAC format, so why not?

As for fine-tuning, you can simply add an option where the user specifies several ranges of the total number of pixels in the image, and depending on the range, images are processed in 1 to N threads. For example (approximately), if the image is 4000 × 4000 = 16,000,000 pixels, process it in one thread; if it is 3000 × 3000 = 9,000,000, in two; and so on. This does not seem very difficult to implement.
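
A rough sketch of that idea, assuming the ranges are given as (minimum pixel count, thread limit) pairs. The thresholds simply reuse the approximate numbers above, and `RANGES` and `threads_for_image` are names invented for this example, not anything that exists in XL Converter:

```python
from typing import List, Tuple

# (minimum pixel count, thread limit), checked from the largest range down.
# The values reuse the approximate example figures; real ones would be user-set.
RANGES: List[Tuple[int, int]] = [
    (4000 * 4000, 1),
    (3000 * 3000, 2),
]

def threads_for_image(width: int, height: int, max_threads: int,
                      ranges: List[Tuple[int, int]] = RANGES) -> int:
    """Return the thread limit for an image of the given size."""
    pixels = width * height
    for min_pixels, limit in sorted(ranges, reverse=True):
        if pixels >= min_pixels:
            return min(limit, max_threads)
    return max_threads  # small images keep the configured maximum

print(threads_for_image(4000, 4000, max_threads=6))  # 1
print(threads_for_image(3000, 3000, max_threads=6))  # 2
print(threads_for_image(1920, 1080, max_threads=6))  # 6
```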

As for the "emergency switch", I agree that this is something that should not need to exist, if the operating system itself could correctly handle such situations. But since reality is what it is, it is much better to start encoding again than to reboot the computer. At the moment, it is absolutely necessary.

As for alternative programs, of course, I considered them. The problem is that they either have no settings at all, or only a minimum. It is also unclear what exactly is happening: lossless recompression from JPG to JXL, or an irreversible change of format. And ImageMagick does not have a GUI at all.

P.S. By the way, I advise you to add the "JXL" tag in addition to the others, so that it is easier to find your program on this site.

JacobDev1 commented 2 months ago

Solutions to high RAM usage

I'm not taking anything away. I just added a new feature, and I'm suggesting solutions. You decide how to use the program.

Said solutions:

  1. Use "Low RAM" mode and pick any "Effort" you'd like. The conversion speed should be near-identical for very big images and lower for smaller ones.
  2. Use "Performance" mode, but if your data set has some very large images, go no higher than "Effort" 7. For regular-size images, higher "Efforts" will work fine.

Processing very high resolution images with extreme presets like "Effort" 10 comes with caveats. This should be expected.

Encoding audio and images is different. Audio codecs (like FLAC) process audio in chunks.

$ /usr/bin/time -f "RAM Usage: %M KB" flac src.wav -o dst.flac -8 -f
[...]
RAM Usage: 3836 KB

As you can see, it's only 4 MB of RAM. This audio is 60 minutes of white noise.

In image processing, encoders typically need to look at the entire picture to optimize it. This is why they require so much RAM. Pixel count grows quadratically with the image's linear dimensions, so doubling the width and height quadruples the data. Streaming encoding is much more complex to do here, so it's not as widespread.
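
To put numbers on the scaling, here is a small sketch that computes the size of a single uncompressed copy of an image. The channel count and bytes per sample are illustrative assumptions, not a statement about what any particular encoder allocates; an encoder may hold several such buffers plus working data.

```python
def raw_buffer_mib(width: int, height: int,
                   channels: int = 3, bytes_per_sample: int = 4) -> float:
    """Size of one uncompressed copy of the image in MiB.

    channels and bytes_per_sample are illustrative assumptions
    (three colour channels stored as 32-bit values).
    """
    return width * height * channels * bytes_per_sample / 2**20

for w, h in [(3840, 2160), (7680, 4320)]:
    print(f"{w} x {h}: {w * h:,} pixels, ~{raw_buffer_mib(w, h):.1f} MiB per copy")
```

Doubling both dimensions from 3840 × 2160 to 7680 × 4320 quadruples the pixel count, and with it the size of every full-image buffer.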

Streaming encoding should not be taken for granted. It is revolutionary. JPEG XL has recently added it. The article showcases how it works in "Efforts" up to 9. That applies to the API; the cjxl encoder supports it only up to "Effort" 7, but this will most likely change.

I'll try to optimize dynamic thread management, but by the time it arrives, it might not even be necessary. The official JPEG XL presentation speculates libjxl 1.0 may come out in Q3 2024. src

Alternatives

Lossless means no loss of data. The "Lossless JPEG Transcoding" is a special feature that requires extra considerations. Popular GUI image converters do not support it. This is a very niche feature. One shouldn't assume "Lossless" implies "Lossless JPEG Transcoding".

The only program with that special feature is cjxl.

I ship a UTF-8 patched version in xl-converter/bin/win/

Run cjxl --help -v -v to get more details. It is very clear what it does.

Example usage:

cjxl input.jpg output.jxl

Output:

JPEG XL encoder v0.10.2 e148959 [AVX2,SSE4,SSE2]
Note: Implicit-default for JPEG is lossless-transcoding. To silence this message, set --lossless_jpeg=(1|0).
Encoding [JPEG, lossless transcode, effort: 7]
Compressed to 125.8 kB including container 
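
For scripting, the same call can be made from Python. A minimal sketch, assuming `cjxl` is on PATH; `--lossless_jpeg` is the flag named in the encoder's own note above, while the helper names `build_cjxl_command` and `transcode` are made up for this example:

```python
import subprocess
from typing import List

def build_cjxl_command(src: str, dst: str, lossless_jpeg: bool = True) -> List[str]:
    """Build the argument list for cjxl with the transcoding choice made explicit."""
    return ["cjxl", src, dst, f"--lossless_jpeg={1 if lossless_jpeg else 0}"]

def transcode(src: str, dst: str, lossless_jpeg: bool = True) -> None:
    """Run cjxl and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_cjxl_command(src, dst, lossless_jpeg), check=True)

print(build_cjxl_command("input.jpg", "output.jxl"))
```

Passing the flag explicitly also silences the implicit-default note and makes it obvious in a script whether a JPEG is being transcoded losslessly or re-encoded.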

Misc.

For reference, I know how to write things. I'm not a hobby programmer.

Writing a feature is not the problem, maintaining it is. Adding a feature is 20% of the work, the other 80% is maintenance. This is a mind-numbing, never-ending chore that makes programmers burn out.

There are plenty of scientific articles on this topic.

More code = more chores.

I'm trying to say there are things users may not be aware of that affect the development. This calls for some understanding.

redmoon1945 commented 2 months ago

Great app, thank you for the time you invest :-)

Ukhryuk-Hai commented 1 month ago

> Lossless means no loss of data. The "Lossless JPEG Transcoding" is a special feature that requires extra considerations. Popular GUI image converters do not support it. This is a very niche feature. One shouldn't assume "Lossless" implies "Lossless JPEG Transcoding".
>
> The only program with that special feature is cjxl.

I remembered that there is at least one similar program that can compress JPG with approximately the same efficiency - PackJPG. But this is not a full-fledged graphic format, since there are no viewers for it, it is just a specific archiver.

JacobDev1 commented 1 month ago

JPEG XL uses Brunsli, which can also be used directly to generate brn files. You can try it here.

brn files cannot be opened by image viewers, which is why using Brunsli through cjxl to transcode to JPEG XL is ideal.

The older alternatives are PackJPG and Dropbox Lepton. Those are less efficient and not maintained anymore.