sn4k3 / UVtools

MSLA/DLP, file analysis, calibration, repair, conversion and manipulation
GNU Affero General Public License v3.0

[Bug] Constant crashes on Linux #830

Closed tddts closed 6 months ago

tddts commented 6 months ago

System

UVtools v4.1.0 X64
Operative system: Linux 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:33:48 UTC 2024 X64
Processor: AMD Ryzen 7 7700X 8-Core Processor
Processor cores: 16
Memory RAM: 5.18 / 30.49 GB
Runtime: linuxmint.21.3-x64
Framework: .NET 6.0.26
AvaloniaUI: 11.0.7
OpenCV: 4.8.1

Screens, resolution, working area, usable area:
1: 1920 x 1080 @ 100% (Primary)
    WA: 1920 x 1040    UA: 1920 x 1040

Path:       /tmp/.mount_UVtoolQbQSOE/usr/bin/
Executable: /tmp/.mount_UVtoolQbQSOE/usr/bin/UVtools

Printer and Slicer

Description of the bug

The app crashes either while opening a file or when trying to open the "Issues" tab. The whole system freezes while the app is loading/calculating. There is no error log in the settings folder and no output if I run it from the terminal.

I've had similar crashes before with some files. The last time the app was crashing constantly, I rolled back from 4.0.6 to 4.0.5 and it mostly worked. Now I've reinstalled my whole system (for unrelated reasons) and even 4.0.5 no longer works.

How to reproduce

  1. Install the latest Linux Mint
  2. Try opening a file
  3. Experience a crash
  4. If it did not crash, try opening the "Issues" tab, then go to step 3

Files

There are two files here. The first is a simple cube I created in Lychee; it opens fine. The second file opened once, but crashed after I tried to open the "Issues" tab; now it doesn't open at all, crashing on load. https://drive.google.com/drive/folders/1klkspVe7JEjMR90WdmTua67er07E-XOX?usp=sharing

github-actions[bot] commented 6 months ago

This is your first time submitting an issue with UVtools 🥳 Please review your issue and ensure that the submission template was followed, the information is complete, and it is not related to any other open issue. It will be reviewed shortly. Debug reports are very important and make the program better. Thanks for contributing and making the software better! 🙌

sn4k3 commented 6 months ago

I think the problem is a lack of RAM to process such a file with the system defaults using all available power. Linux handles running out of RAM worse than Windows does. Do you have swap configured?

On Windows: image

On my Linux VM it crashes while loading: RAM and swap hit 100%, then the system kills the app to save the OS.

image

Decoding 12K images in parallel is a RAM hog. The solution is to decrease the core count per workload in the UVtools settings; see "max degree of parallelism" under the Tasks settings. The more cores you have, the more images are decoded/encoded in parallel, so when working with n 12K bitmaps at once, either you have the RAM or it will crash. The better the CPU, the more RAM you need to feed all that CPU power: each core handles one bitmap, so more cores = more RAM.
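The "each core handles one bitmap" scaling can be sketched with a quick back-of-envelope calculation. This is a Python illustration, not UVtools code; the 8-bit grayscale assumption and the 11520 x 5120 "12K" resolution are assumptions for the example.

```python
# Rough estimate of peak decode RAM: one raw bitmap per parallel worker.
# Assumes 8-bit grayscale layers; UVtools' real buffers may differ.
def peak_decode_ram_gb(width, height, parallel_layers, bytes_per_pixel=1):
    """Peak RAM for raw layer bitmaps scales linearly with worker count."""
    return width * height * bytes_per_pixel * parallel_layers / 1024**3

# A 12K plate (11520 x 5120) decoded with 16 workers vs. 4 workers:
print(round(peak_decode_ram_gb(11520, 5120, 16), 2))  # ~0.88 GB of raw bitmaps
print(round(peak_decode_ram_gb(11520, 5120, 4), 2))   # ~0.22 GB
```

This only counts the raw decoded bitmaps; compression buffers and processing scratch space come on top, which is why real usage is higher.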

Opens fine (smaller spike) vs. doesn't open (big spike):

image

The file that opens fine consumes less RAM because it has fewer layers to process, so there is less memory pressure.

sn4k3 commented 6 months ago

However, the insane amount of RAM shown in use makes little sense, because with your core count the calculations aim for about 1 GB. For that reason I reviewed the code for Goo and found that I forgot to dispose of the Mat after using it, so the objects remained alive even when no longer required. That means it decodes and keeps all 12K bitmaps in RAM and only auto-releases them via the GC once everything is processed. So the earlier tip to reduce the core count per workload won't help, as the objects will still be alive...

I will fix this in the next patch. Only the Goo format is affected.
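The leak pattern described above (every decoded Mat kept alive until the GC sweeps the whole batch, so reducing parallelism doesn't help) can be illustrated with a stdlib-only Python sketch. The class and function names are hypothetical analogues, not UVtools code.

```python
# Hypothetical analogue of the Goo decode leak: buffers that are never
# disposed in the loop all stay alive at once, so peak memory ~= all layers.
class FakeMat:
    live = 0                         # how many buffers are currently alive

    def __init__(self, nbytes):
        self.buf = bytearray(nbytes)
        FakeMat.live += 1

    def dispose(self):               # analogue of Mat.Dispose() in .NET
        self.buf = None
        FakeMat.live -= 1

def decode_layers_leaky(n):
    """Every decoded layer is kept referenced; released only at the end."""
    mats, peak = [], 0
    for _ in range(n):
        mats.append(FakeMat(1024))   # decoded Mat, never disposed in-loop
        peak = max(peak, FakeMat.live)
    for m in mats:                   # bulk release at the end (like the GC)
        m.dispose()
    return peak

def decode_layers_fixed(n):
    """Each layer buffer is disposed right after use: peak stays at 1."""
    peak = 0
    for _ in range(n):
        m = FakeMat(1024)
        peak = max(peak, FakeMat.live)
        m.dispose()                  # release immediately: the fix
    return peak

print(decode_layers_leaky(100))  # 100 buffers alive at peak
print(decode_layers_fixed(100))  # 1 buffer alive at peak
```

This is also why lowering the core count didn't help before the fix: the leaky version's peak depends on the layer count, not on how many workers run at once.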

Difference after fix, on the larger file:

image

tddts commented 6 months ago

I tried decreasing parallelism to one, but it still consumed all of my RAM plus 7 GB of swap (which I did not expect to be used at all with this amount of RAM, but now it seems to be too small) and died. I sliced the same model into .ctb and got the same result when opening it, so I guess it's not only about .goo files.

sn4k3 commented 6 months ago

Can you attach the CTB file?

tddts commented 6 months ago

here: https://drive.google.com/file/d/1C_aamvHNi6menlZlaaZMfuE_n2rRr9mw/view?usp=sharing

sn4k3 commented 6 months ago

OK, EncryptedCTB is also affected by the same problem. After that I reviewed all the other file formats and found that SVGX has the same problem too. It's strange that no one complained before with CTB, given the big resolutions and high layer counts...

sn4k3 commented 6 months ago

Check the new version; it should be fixed, including the issue-detection leak.

tddts commented 6 months ago

4.2.0 works fine with the previous files. I tried opening a bigger file (2,828 layers, ~624 MB, LZ4 compression). With parallelism set to Auto (-1) it opens fine, but on island detection it also consumes all RAM and dies. With parallelism set to one, island detection consumes up to 5.6 GB of RAM and works normally. With parallelism = 10 it consumes up to 7 GB, and with parallelism set to Max (16/16) up to 8.6 GB. I think there is something wrong with the Auto setting.

sn4k3 commented 5 months ago

That file is huge and uses almost all of the 12K plate, so little to no optimization can be made in that case. Issue detection will escalate RAM usage, since each operation needs multiple Mats (in some cases 2 to 4), and resin-trap detection needs to build a cache of Mats, which is not small.
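The extra cost of issue detection can be roughed out the same way as decoding. This Python sketch is an illustration only; the "2 to 4 Mats per operation" multiplier comes from the description above, while the 8-bit assumption and the cache size are hypothetical.

```python
# Hedged estimate of issue-detection RAM: each worker needs several scratch
# Mats per layer, and resin-trap detection adds a persistent layer cache.
# The multipliers here are assumptions for illustration, not UVtools values.
def detection_ram_gb(width, height, workers, mats_per_op=4, cached_layers=0):
    per_pixel = 1  # assume 8-bit grayscale layers
    working = width * height * per_pixel * mats_per_op * workers
    cache = width * height * per_pixel * cached_layers
    return (working + cache) / 1024**3

# 12K plate, 16 workers, plus a hypothetical 100-layer resin-trap cache:
print(round(detection_ram_gb(11520, 5120, 16, cached_layers=100), 1))  # ~9.0 GB
```

Under these assumptions the cache term dominates, which matches the observation that resin-trap detection is where memory peaks.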

The Auto (-1) parallelism is not handled by the program but by the .NET framework: ParallelOptions.MaxDegreeOfParallelism.

Especially these lines:

... If it is -1, there is no limit on the number of concurrently running operations (with the exception of the [ForEachAsync](https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.foreachasync?view=net-8.0) method, where -1 means ProcessorCount).

When the thread pool's heuristics is unable to determine the right number of threads to use and could end up injecting too many threads. For example, in long-running loop body iterations, the thread pool might not be able to tell the difference between reasonable progress or livelock or deadlock, and might not be able to reclaim threads that were added to improve performance. In this case, you can set the property to ensure that you don't use more than a reasonable number of threads.

So in that case the framework manages the creation of tasks based on what it thinks is best for performance. For example, if you pause a task, it will keep spawning new ones because there is an opportunity to process data (since others are paused). If it's very busy, it will wait for an opportunity to spawn a new task. So in the end it may spawn more tasks than your CPU core count; in my case I have 32 cores and it spawns 34 to 36 tasks.

But if you choose a number, it will always stay within that limit.

What I use myself, and what I recommend, is the "!" (Optimal) setting. It sets the limit to your core count minus a few, making sure some cores stay free to keep your system responsive; that way you can use your OS without much lag. -1 should be avoided if you actively use your PC during UVtools processing, e.g. for web, videos, or work.
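The "core count minus a few" idea maps directly onto a bounded worker pool. Here is a hedged Python analogue (UVtools itself uses .NET's ParallelOptions.MaxDegreeOfParallelism); the function names and the reserve of 2 cores are assumptions for the sketch.

```python
import os
from concurrent.futures import ProcessPoolExecutor

def optimal_workers(reserve=2):
    """Analogue of UVtools' '!' (Optimal) setting: core count minus a few,
    keeping some cores free so the desktop stays responsive."""
    return max(1, (os.cpu_count() or 1) - reserve)

def process_layer(layer_index):
    # Placeholder per-layer workload; the real detection is OpenCV-based.
    return layer_index * layer_index

if __name__ == "__main__":
    # Bounded pool: never spawns more workers than the chosen limit,
    # unlike -1 where the framework may overshoot the core count.
    with ProcessPoolExecutor(max_workers=optimal_workers()) as pool:
        results = list(pool.map(process_layer, range(32)))
    print(len(results))  # 32
```

The key property is the hard upper bound: a fixed `max_workers` caps both CPU and peak memory, whereas an unbounded (-1 style) scheduler may briefly exceed the core count, as described above.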

Here are the results on my system: image

The most memory is taken by resin traps due to the MatCache requirement. If your models are solid, or you know for sure they are OK, please disable resin-trap detection for such files.

As an alternative, you can upgrade your RAM with another 32 GB; it would make a difference in these cases. Or, cheaper: increase your swap. If it is on an SSD the performance is better; still not as fast as RAM, but it will save you in these situations.