jpcy / xatlas

Mesh parameterization / UV unwrapping library
MIT License

Optimisation for high-load input #125

Open Korinin38 opened 1 year ago

Korinin38 commented 1 year ago

This fork is still a bit rough but is under active development. It contains some significant changes and stricter STL dependencies, so I suggest leaving this PR unmerged and using it as a reference link for anyone who is interested.

Motivation

Large-scale projects (photo-realistic objects with more than 1M faces) can be processed with xatlas, but with a caveat: either a significant time cost (with the BruteForce packing method) or a sub-optimal result (with the Random packing method). The purpose of this fork is to speed up BruteForce packing while retaining its packing efficiency, or to combine the two methods to get an optimal result in an acceptable time.

To compare the performance of the old and new versions, see https://github.com/Korinin38/xatlas-comparison.

Additions

Computing charts

No changes.

Packing charts

Changes

lcc815 commented 10 months ago

Hi, @Korinin38

Thanks for your great work. However, when I tried your repo, the build failed with the following error:

xatlas_static.make:182: recipe for target 'obj/x86_64/Release/xatlas_static/xatlas.o' failed
make[1]: *** [obj/x86_64/Release/xatlas_static/xatlas.o] Error 1
Makefile:120: recipe for target 'xatlas_static' failed
make: *** [xatlas_static] Error 2

Did I do anything wrong? Any help is appreciated.

Korinin38 commented 10 months ago

Hi, @lcc815

Have you tried building the current version of xatlas without these contributions? If so, is this error unique to xatlas_hi-res_optimised?

lcc815 commented 10 months ago

Yes! I can build the current version of xatlas without these contributions successfully.

lcc815 commented 10 months ago

@Korinin38 Hello, could you please look into this? The current version of xatlas is just too slow to use.

Korinin38 commented 10 months ago

Hi, @lcc815, sorry for keeping you waiting,

I examined the problem. The error seems to be caused by the addition of exception handling; it will be fixed soon. As an immediate workaround, try changing the exceptionhandling parameter from "Off" to "On" in both the "xatlas" and "xatlas_static" projects in premake5.lua, or use the patch that does just that.

Please let me know whether that helps.

lcc815 commented 10 months ago

@Korinin38 Yes, it works! I can build and run your code successfully, just like the official code. However, the running time is the same as with the official code : ) This is what I run:

./build/gmake/bin/x86_64/Release/example my_mesh.obj

I believe I did something wrong. Could you please tell me how to use your code to speed up the unwrapping process? Again, thanks for your reply!

Korinin38 commented 10 months ago

I'll look into it. Can you provide logs for both runs? It would help me greatly.

Bear in mind that there are two independent steps: computing charts and packing charts. If the slowest step is the former, I can do nothing about it, as this fork does not change it in any way.
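To see which of the two steps dominates, each one can be timed separately. The following is only a minimal sketch: `StepTiming`, `timeStep`, and `slowest` are hypothetical helpers written for this illustration and are not part of xatlas or of this fork.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <functional>

// Hypothetical helper type for this sketch; not part of xatlas.
struct StepTiming {
    const char *name; // label printed next to the measurement
    double ms;        // wall-clock duration in milliseconds
};

// Runs one step, prints how long it took, and returns the measurement.
StepTiming timeStep(const char *name, const std::function<void()> &step)
{
    assert(step && "step must be callable");
    const auto start = std::chrono::steady_clock::now();
    step();
    const auto stop = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    std::printf("%-16s %10.1f ms\n", name, ms);
    return StepTiming{name, ms};
}

// Returns whichever of the two measured steps took longer.
const StepTiming &slowest(const StepTiming &a, const StepTiming &b)
{
    return a.ms >= b.ms ? a : b;
}
```

In a real program the two callables would presumably wrap `xatlas::ComputeCharts(atlas)` and `xatlas::PackCharts(atlas)` instead of a single combined call, so that the log shows one line per step.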

lcc815 commented 10 months ago

Aha, I got it. The bottleneck is computing charts. Packing charts does take less time than with the official code. Thanks a lot!

siliconvoodoo commented 3 weeks ago

I pulled your branch and did 3 tests with a city block model.

original xatlas: 73 seconds
korinin, XA_USE_GPU off, CUDA_SUPPORT off, OpenMP unset: 169 seconds
korinin, XA_USE_GPU off, CUDA_SUPPORT off, OpenMP on: didn't finish after 30 minutes; force stopped.

I didn't try CUDA because of build setup complexities.

Korinin38 commented 3 weeks ago

Hello @siliconvoodoo, thank you for reaching out,

Is it possible for you to provide the example model? If not, could you share details such as number of vertices & faces, as well as number of charts parametrized with xatlas? It's also useful to know which PackOptions you used, and whether the results of packing are satisfactory in both finished tests.

siliconvoodoo commented 2 weeks ago

It's a model with 52 meshes: 4,274,763 vertices and 7.5M triangles. With original xatlas the viewer gives these stats: [image]

But today I tried to build with OpenCL. That was very hard because your branch doesn't include the dependencies, so I found them in your other repository, xatlas-comparison. I then installed the CUDA toolkit to get libopencl.

I built your 3 libraries (clew/gpu/misc) with CMake and manually added includes and link dependencies. (Just a note: this PR won't get accepted because it breaks the standalone principle of xatlas, which is only one .h and one .cpp.)

You are missing some template exports:

template OpenCLKernelArg::OpenCLKernelArg(const gpu::shared_device_buffer_typed<unsigned __int64>& arg);
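For readers unfamiliar with this kind of link error, here is a minimal single-file sketch of explicit instantiation for a constructor template. All names in it (`gpu_sketch::BufferTyped`, `KernelArg`) are hypothetical stand-ins, not the fork's actual classes; in a real project the constructor definition and the instantiations would sit in a .cpp file, which is why every element type used elsewhere has to be listed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace gpu_sketch {
// Hypothetical stand-in for a typed device buffer.
template <typename T>
struct BufferTyped {
    const T *data;
    std::size_t count;
};
} // namespace gpu_sketch

// Hypothetical stand-in for a kernel argument wrapper.
class KernelArg {
public:
    // Declared here; the definition would normally live in a .cpp file.
    template <typename T>
    KernelArg(const gpu_sketch::BufferTyped<T> &buffer);

    const void *data() const { return ptr_; }
    std::size_t sizeInBytes() const { return size_; }

private:
    const void *ptr_;
    std::size_t size_;
};

// Definition of the constructor template.
template <typename T>
KernelArg::KernelArg(const gpu_sketch::BufferTyped<T> &buffer)
    : ptr_(buffer.data), size_(buffer.count * sizeof(T))
{
    assert(buffer.data != nullptr || buffer.count == 0);
}

// Explicit instantiations: one line per element type that other
// translation units are allowed to use. A missing line here shows up
// as an unresolved symbol at link time.
template KernelArg::KernelArg(const gpu_sketch::BufferTyped<std::uint32_t> &);
template KernelArg::KernelArg(const gpu_sketch::BufferTyped<std::uint64_t> &);
```

The portable `std::uint64_t` is used here in place of the MSVC-specific `unsigned __int64` from the line above; whether the fork should prefer one over the other is not something this sketch settles.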

The first blit.cl I tried had old content and didn't build. Setting printLog = true showed:

Device 1 Program build log:

:16:7: error: use of undeclared identifier 'uint32_t'
for (uint32_t y = 0; y < ch; y++) {
     ^
...

I fixed that, but to no avail: there was then no aggregateResults entry point.

I found the more recent blit.cl, and then hit this error:

:304:32: error: call to 'max' is ambiguous
const unsigned int extentX = max(w, offset_x + chartSizes[0]);
                             ^~~
cl_kernel.h:3498:22: note: candidate function
int __OVERLOADABLE__ max(int, int);
                     ^
cl_kernel.h:3499:23: note: candidate function
uint __OVERLOADABLE__ max(uint, uint);

I fixed it with casts.
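The same ambiguity can be reproduced in plain C++ when an overloaded `max` receives one signed and one unsigned argument. The sketch below assumes that `w` was a signed `int` while the other operand was unsigned; the actual types in blit.cl are not shown in the log, and `cl_sketch::max` and `atlasExtent` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

namespace cl_sketch {
// Two overloads mirroring the candidates listed in the build log.
inline int max(int a, int b) { return a > b ? a : b; }
inline unsigned int max(unsigned int a, unsigned int b) { return a > b ? a : b; }
} // namespace cl_sketch

// Computes the new extent along one axis after placing a chart.
unsigned int atlasExtent(int currentWidth, unsigned int offset, unsigned int chartSize)
{
    // cl_sketch::max(currentWidth, offset + chartSize) would not compile:
    // each overload needs exactly one int <-> unsigned conversion, so
    // neither is a better match than the other.
    assert(currentWidth >= 0); // the cast below is only safe for non-negative values
    return cl_sketch::max(static_cast<unsigned int>(currentWidth), offset + chartSize);
}
```

Casting both operands to the same type selects one overload by exact match; declaring the variables with matching types in the first place would avoid the cast altogether.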

Then execution failed because the call site was missing arguments: kernels["blitLevel"].exec lacked w and h, so I added them and retried.

But then execution failed at a further point, with this log:

{_Ptr=0x0000000024f730a0 "Kernel aggregateResults: CL_UNKNOWN_ERRORCODE-9999 (-9999) at line 368" }

A different exception came from the viewer project:

{_Ptr=0x000001ec1a3f1bc0 "Global work_size[0] value is zero!" }

I only got it to work on a Cornell box model or a model with two cubes.
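The "Global work_size[0] value is zero!" message suggests that a kernel was enqueued with nothing to process. Why the count reaches zero in this fork is not known from the thread, so the following is only a defensive sketch: `LaunchPlan` and `planLaunch` are hypothetical names, and the host code would skip the enqueue whenever `launch` is false.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a 1D kernel launch for this sketch.
struct LaunchPlan {
    bool launch;            // false when there is no work to submit
    std::size_t globalSize; // item count rounded up to a multiple of groupSize
};

// Rounds itemCount up to a whole number of work-groups and flags
// empty inputs so the caller can skip the enqueue entirely.
LaunchPlan planLaunch(std::size_t itemCount, std::size_t groupSize)
{
    assert(groupSize > 0);
    if (itemCount == 0) {
        return LaunchPlan{false, 0};
    }
    const std::size_t groups = (itemCount + groupSize - 1) / groupSize;
    return LaunchPlan{true, groups * groupSize};
}
```

Because the global size is rounded up, the kernel itself still needs a bounds check against the real item count.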