Ultimaker / Cura

3D printer / slicing GUI built on top of the Uranium framework
GNU Lesser General Public License v3.0

Very slow loading big STL files #10696

Closed ANTONIOPSD closed 2 years ago

ANTONIOPSD commented 3 years ago

Application Version

4.11

Platform

Windows 10 x64

Printer

Creality CR6-SE

Reproduction steps

I print lithophanes, and some of them are over 1.5 GB. When I try to load them in Cura, it crashes after trying to load them for more than 5 minutes at 100% CPU. I can load them in Simplify3D in less than 40 seconds with no problems at all, and they get sliced in less than 30 seconds. I also create complex parts for some devices; they take less than 10 seconds to load and slice in Simplify3D, but in Cura they sometimes take up to a minute to load and slice with almost the same settings.

Are there any plans to improve the loading process and make it faster, as in other slicers such as Simplify3D?

STEPS:

Load a +1GB STL file

Actual results

Loading a +1 GB STL hangs for some minutes at 100% CPU, then either crashes or takes ages to load.

Expected results

A +1 GB file loads fast.

Checklist of files to include

Additional information & file uploads

I just want to fully move to Cura and stop using other outdated slicers, but on the same hardware Cura is far too slow by comparison.

Just try to load any huge STL file in Cura and in other slicers and you will see how slow Cura is.

fvrmr commented 3 years ago

Hi @ANTONIOPSD, thank you for your report. A 1 GB STL is really big to load in Cura. The resolution is too high to slice, so you could lower the resolution of your model.

It could also be that your STLs are ASCII instead of binary STLs.
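
A quick way to tell the two variants apart, as a minimal sketch: a binary STL consists of an 80-byte header, a 4-byte little-endian triangle count, and 50 bytes per triangle, so its file size must equal 84 + 50 × count. The function name `is_binary_stl` is made up for this example and is not part of Cura or Uranium.

```python
import os
import struct


def is_binary_stl(path: str) -> bool:
    """Return True if the file size matches the binary STL layout."""
    size = os.path.getsize(path)
    if size < 84:
        # Too small to hold the 80-byte header plus the triangle count.
        return False
    with open(path, "rb") as f:
        f.seek(80)  # skip the 80-byte header
        (num_faces,) = struct.unpack("<I", f.read(4))
    # Each triangle is 12 floats (48 bytes) plus a 2-byte attribute field.
    return size == 84 + 50 * num_faces
```

If this returns False for a valid model, the file is most likely an ASCII STL, and re-exporting it as binary should shrink it considerably.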

ANTONIOPSD commented 3 years ago

> Hi @ANTONIOPSD, thank you for your report. A 1 GB STL is really big to load in Cura. The resolution is too high to slice, so you could lower the resolution of your model.
>
> It could also be that your STLs are ASCII instead of binary STLs.

Yeah, but is there any work in progress to improve the handling of big STL files, like Simplify3D does? Sometimes the file needs to be that big because of the required quality. If other slicers can do it, I'm sure the Cura devs can do it. I would help if I could, but I lack the skills needed to contribute to the code 😅

Ghostkeeper commented 3 years ago

We use several libraries to process the input models loaded into Cura. Cura's loading is going to be as slow as those libraries.

There are four operations in Cura that are linear+ in the size of the model, as far as I can think of:

I just tried loading a 1GB ASCII STL file. It took 31.8s to load. Actually not too bad, although I think my file system was cached in RAM there because I had just written the file. And the convex hull of the model is square.

aconz2 commented 2 years ago

I also see very slow load times on an STL with 16 million verts and 5.4 million faces (~260 MB binary format). Some timings, taken from the log (a line like `[JobQueueWorker [2]] UM.FileHandler.ReadFileJob.run [83]: Loading file took 7.0 seconds`):

| platform | version | time (s) | using numpy-stl? |
|---|---|---|---|
| windows | 5.0.0-beta+1 | 31.6 | |
| windows | 4.13.1 | 7.0 | |
| linux | 5.0.0-beta+1 | 138.8 | |
| linux | 4.13.1 | 40.3 | |

(Hardware note: linux CPU single core benches around 40% faster than on Windows; so I would have hoped for better perf than I see)

For anyone coming across this looking for a workaround, try 3MF format. The same model in 3MF loads in 9.7 seconds on Linux with Cura 5.0

https://github.com/Ultimaker/Uranium/blob/6a23f8d6d00b0286a66018ae995958b81bd180fe/plugins/FileHandlers/STLReader/STLReader.py#L19-L34

Poking around a bit, my suspicion is here:

https://github.com/Ultimaker/Uranium/blob/6a23f8d6d00b0286a66018ae995958b81bd180fe/plugins/FileHandlers/STLReader/STLReader.py#L180-L187

I pulled out the logic for _loadBinary into a test script (see below). I can load the same STL in 3.1 seconds on my machine when there is no time.sleep(0) (the equivalent of Job.yieldThread()). But with time.sleep(0) in every loop iteration, the load takes 4.2 seconds, an overhead of roughly 35%. And if I were to guess how this behaves in Cura itself, actually having other threads running could increase this overhead by quite a bit.

Is that plausible? Would it be acceptable to only Job.yieldThread every Nth iteration instead of every iteration? Maybe we could do the first thousand iterations to get an approximate speed and then calculate an appropriate yielding frequency to meet whatever responsiveness period you're interested in? Or if you could share more info on CURA-7154 perhaps solving that would be the best solution in this case.
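
A rough sketch of that calibration idea, outside of Cura: time the first batch of iterations while yielding every time, then work out how many iterations fit into a target responsiveness period and yield only that often. Here `time.sleep(0)` stands in for Job.yieldThread(); the helper name, the calibration size of 1000 and the 10 ms period are arbitrary choices for illustration, not anything taken from Uranium.

```python
import time
from typing import Callable


def run_with_adaptive_yield(
    num_items: int,
    work: Callable[[int], None],
    period_s: float = 0.01,
    calibration_items: int = 1000,
) -> int:
    """Call work(i) for each i, yielding roughly once per period_s.

    Returns the number of times the thread was yielded.
    """
    yield_every = 1  # yield on every iteration while calibrating
    yields = 0
    start = time.perf_counter()
    for idx in range(num_items):
        work(idx)
        if idx + 1 == calibration_items:
            per_item = (time.perf_counter() - start) / calibration_items
            if per_item > 0:
                # Number of iterations that fit into one period.
                yield_every = max(1, int(period_s / per_item))
            else:
                yield_every = calibration_items
        if (idx + 1) % yield_every == 0:
            time.sleep(0)  # stand-in for Job.yieldThread()
            yields += 1
    return yields
```

Whether this helps inside Cura would still depend on what the other threads are doing, so it would need to be measured there.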

I did not fully look into why 5.0 is 3x slower than 4.13.1 on linux. If it were the Job.yieldThread thing I mentioned, maybe there are just more threads now?

And I also was surprised 5.0 on Windows wasn't using numpy-stl, is that a mistake?

Test script

```python
import sys
import struct
import os
import time
from typing import cast

t1 = time.time()
f = open(sys.argv[1], 'rb')
f.read(80)  # Skip the header
num_faces = cast(int, struct.unpack("
```

smartavionics commented 2 years ago

Hi @aconz2, intrigued by your post, I did a quick test on my Cura 4 based Linux build. I reduced the number of calls to yieldThread() by a factor of 10 in the binary STL loader and was surprised by the results. The total time to load the 86 MB file actually increased from around 20 seconds to 30! However, the motion of the bouncing blue rectangle at the bottom of the screen became much smoother with the reduced calls to yieldThread(). Here's what I added...

        for idx in range(0, num_faces):
            data = struct.unpack(b"<ffffffffffffH", f.read(50))
            mesh_builder.addFaceByPoints(
                data[3], data[5], -data[4],
                data[6], data[8], -data[7],
                data[9], data[11], -data[10]
            )
            if (idx + 1) % 10 == 0:
                Job.yieldThread()

I shall play some more with this to try and understand the observed behaviour.

smartavionics commented 2 years ago

So, I made a mistake in my previous test: the run that took around 20 seconds to load was actually with no calls to yieldThread() in the loop. With a call to yieldThread() every 100 loops, loading took around 21 seconds and the blue rectangle moved quite smoothly. With a call every 1000 loops, loading took around 20 seconds but the animation was jerky.

So it appears that calling yieldThread() every time around the loop doesn't actually cause a slowdown?

aconz2 commented 2 years ago

@smartavionics I did some follow up testing on Linux and got:

| version | yielding? | time (s) |
|---|---|---|
| 4.13.1 | yes | 31.1 |
| 4.13.1 | no | 15.8 |
| 5.0b1 | yes | 105.3 |
| 5.0b1 | no | 16.2 |

I unpacked each release AppImage with the --appimage-extract flag and then edited the source of STLReader.py directly and inspected the log output.

It does seem mysterious and I wouldn't be surprised if Thread.yield is a red herring in the end.

smartavionics commented 2 years ago

> It does seem mysterious and I wouldn't be surprised if Thread.yield is a red herring in the end.

That yieldThread() call just invokes time.sleep(0) which gives other threads a chance to run.

nallath commented 2 years ago

We have found an issue with the 5.0 release: we accidentally forgot to include numpy-stl. This meant that Cura used the (slower) fallback STL loading.

jellespijker commented 2 years ago

@nallath Numpy-STL was part of the requirements, see https://github.com/Ultimaker/cura-build-environment/blob/main/projects/requirements.txt

It could be that pyinstaller didn't collect it.
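
One way to check this from within a given Python environment, as a minimal sketch: numpy-stl is imported under the name `stl` (e.g. `from stl import mesh`), so `importlib.util.find_spec` can report whether the package is visible to the interpreter. The helper name `has_module` is hypothetical, and how to run this inside a packaged Cura build is not covered here.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # numpy-stl is imported as `stl`, not `numpy_stl`.
    print("numpy-stl available:", has_module("stl"))
```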