openphotogrammetry / meshroomcl

MeshroomCL: An OpenCL implementation of photogrammetry with the Meshroom interface

"WARNING: Could not fuse any points." and no GPU #60

Open RedGlow opened 5 months ago

RedGlow commented 5 months ago

Hello,

I'm getting the error in the title for all the tests I'm running with MeshroomCL (Monstree, skull-turntable-strong-lights-no-background-dotted, and a custom set of photos). The error pops up during the stereo fusion step of MultiviewStereoCL. It is similar to the error in https://github.com/openphotogrammetry/meshroomcl/issues/50 (although I do actually have an AMD Radeon RX 6600, see below) and to the one in this reddit thread, in which people found no solution.

If I check "Specify OpenCL devices" on the StructureFromMotionCL node I get only one: _0:0_AMD_Accelerated_Parallel_Processinggfx1032 (which sounds correct), but once I start processing, the Statistics tab shows lots of work on the CPU and RAM usage, but the GPU section has a laconic "No GPU", so I suspect something actually went wrong?

I have updated my AMD drivers to the latest version. I have attached the MeshroomCache directory, from which I removed the biggest files (normals, depths, reconstruction db, images, ...) but kept the log files and possibly a few other things.

I don't know if this is enough information to work on, please ask away if you need something more. Thanks!

revisionarian commented 5 months ago

Hi @RedGlow, thanks for trying out MeshroomCL and posting your experience here. I especially appreciate you posting your MeshroomCache directory, which will help us diagnose the problem faster.

The "Statistics" tab in MeshroomCL is not working properly for AMD GPUs (this is a bug in MeshroomCL), so you will not see accurate GPU statistics there. However, I can see from your StructureFromMotionCL and MultiviewStereoCL node log files that the GPU was working properly to accelerate the computation of those nodes.

The problem appears to be in the MultiviewStereoCL node. This node is supposed to perform four sub-tasks:

  1. Undistort the images based on the intrinsic camera calibration parameters;
  2. Compute depth and normal maps for each image;
  3. Fuse the depth images into a single point cloud; and
  4. Mesh the point cloud into a polygonal surface mesh.

Based on the log files, it appears that steps 1 and 2 are working fine, but the fusion step is not fusing any points from the depth maps. To debug this, I think we should look at the contents of the depth and normal maps to verify if they have reasonable data in them.

Can you post a few of your depth and normal map files? You will find these at the pathnames MeshroomCache/MultiviewStereoCL/*/stereo/depth_maps/1188756022.JPG.geometric.bin and MeshroomCache/MultiviewStereoCL/*/stereo/normal_maps/1188756022.JPG.geometric.bin, respectively. Perhaps post 3 files each of the depth and normal maps, and we can look and see if they are empty or if they contain fusable data.
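One quick way to check whether a depth or normal map is empty is to count its finite, non-zero values. The sketch below is only an illustration: it assumes the .bin files use the COLMAP dense-map layout (an ASCII header `width&height&channels&` followed by little-endian float32 values), which this thread does not confirm, and `read_map_header` and `valid_fraction` are hypothetical helper names, not part of MeshroomCL.

```python
import math
import struct


def read_map_header(fid):
    """Read the ASCII header 'width&height&channels&' (assumed layout)."""
    fields = []
    token = b""
    while len(fields) < 3:
        byte = fid.read(1)
        if not byte:
            raise ValueError("file ended before the header was complete")
        if byte == b"&":
            fields.append(int(token))
            token = b""
        else:
            token += byte
    return tuple(fields)  # (width, height, channels)


def valid_fraction(path):
    """Return the fraction of stored values that are finite and non-zero."""
    with open(path, "rb") as fid:
        width, height, channels = read_map_header(fid)
        count = width * height * channels
        payload = fid.read(4 * count)
    if count == 0 or len(payload) != 4 * count:
        raise ValueError("payload size does not match the header")
    # Assumption: values are little-endian 32-bit floats.
    values = struct.unpack("<%df" % count, payload)
    valid = sum(1 for v in values if v != 0.0 and math.isfinite(v))
    return valid / count
```

Calling `valid_fraction("1188756022.JPG.geometric.bin")` on each file gives one number per map. If the layout assumption holds, a result of 0.0 means the map contains nothing that could be fused.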

RedGlow commented 5 months ago

Hello, thanks for the support. I was able to compress the whole depth_maps and normal_maps folders of the only two subdirectories I have under MultiviewStereoCL. Given the incredibly high compression rate, I suspect the resulting files are indeed mostly empty.

da5229e2bad7669a9069a65cec635dff51a30cc6.zip f3fd7035952b9e1bc29be9ad526152699d067d98.zip
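The compression-rate hint above can be turned into a rough number. The following is a minimal sketch using Python's standard `zlib` module; `compression_ratio` is a hypothetical helper name, and the ratio is only a heuristic for how repetitive a file is, not a direct measure of how many depth values are valid.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return the original size divided by the zlib-compressed size."""
    if not data:
        raise ValueError("no data to compress")
    return len(data) / len(zlib.compress(data, 9))
```

A buffer consisting only of zero bytes compresses by a factor of several hundred or more, whereas a map filled with varied depth values would typically compress far less, so an extreme ratio is a reasonable first sign that a file is empty.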

revisionarian commented 5 months ago

Hi @RedGlow, yes, the depth maps you shared are mostly empty. All of the *geometric.bin depth maps are completely empty, and approximately half of the *photometric.bin maps are completely empty. Of those that have any data, this is the one (213788255.JPG.photometric.bin) with the most valid depth values:

[Image: 213788255-depthmap]

As you can see, it is just some sparse noisy lines. Does this pattern match any of the structure in your original images at all? Do you think you get similar depth maps with the "Monstree" dataset?

So, now we know that the problem is likely in the depth map creation, rather than the fusion itself. Let me think about what we can do next to fix this problem.

revisionarian commented 5 months ago

Hi @RedGlow, here's an idea. Could you run the Monstree/mini6 dataset (only 6 images) through the MeshroomCL pipeline, and post the resulting MeshroomCache? You could reduce the image resolution by half (down to 2016x1512) to make it run faster and create smaller output.

If we use the same dataset, I can directly compare your buggy results to the results that we are getting on our machines.

RedGlow commented 5 months ago

Hello @revisionarian , here is the meshroom cache with this experiment (split it into two because of GitHub attachment limits): MeshroomCache1.zip MeshroomCache2.zip

One more thing that may or may not be useful: I see that MeshroomCL is able to extract points in the 3D environment and show them: [Image]

revisionarian commented 5 months ago

Hi @RedGlow, thanks very much for sending your output from the "Monstree-mini6" experiment!

We are now able to replicate the problem that you are observing with MeshroomCL. We observe that the problem occurs with more recent AMD GPU drivers (such as Adrenalin Edition version 23.12.1), but the problem does not exist with older AMD drivers (such as Adrenalin version 22.10.1, Windows Driver Store Version 31.0.12027, from 2022). So, you could probably run MeshroomCL successfully if you switched to an older AMD Radeon driver version.

Our team will keep working to identify the incompatibility of MeshroomCL with the newer drivers, and provide a fix.

RedGlow commented 5 months ago

Thanks a lot! In the meantime I will try an older version of the drivers as soon as possible and report back on whether this workaround succeeds, in case other people have the same problem.

revisionarian commented 5 months ago

Hi @RedGlow, we have further narrowed down the incompatibility with MeshroomCL and the more recent AMD Radeon drivers.

MeshroomCL will work with AMD driver versions 23.5.2 and earlier. AMD Adrenalin driver version 23.5.2 corresponds to Windows Driver Store Version 31.0.14057.5006, and was released on June 1, 2023. MeshroomCL does not currently work with AMD driver versions 23.7.1 and later. AMD Adrenalin driver version 23.7.1 corresponds to Windows Driver Store Version 31.0.21001.45002 and was released on July 6, 2023.

So I think you just need to roll back your AMD display driver to a version from before July 2023. Hopefully we will be able to determine the cause of this problem soon, and release a new version of MeshroomCL that is compatible with all AMD drivers.
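The version boundaries reported above can be written as a small check. This is a sketch only: the two boundary versions come from this thread, while `parse_version` and `classify_driver` are hypothetical helper names, and versions between the two boundaries are treated as unknown because the thread does not mention them.

```python
# Boundaries as reported in this thread (AMD Adrenalin driver versions).
KNOWN_GOOD_MAX = (23, 5, 2)  # latest version reported to work
KNOWN_BAD_MIN = (23, 7, 1)   # earliest version reported to fail


def parse_version(text):
    """Turn a dotted version string such as '23.5.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def classify_driver(text):
    """Classify a driver version against the boundaries reported in the thread."""
    version = parse_version(text)
    if version <= KNOWN_GOOD_MAX:
        return "reported working"
    if version >= KNOWN_BAD_MIN:
        return "reported failing"
    return "not covered by the thread"
```

For example, `classify_driver("23.12.1")` returns "reported failing" and `classify_driver("22.10.1")` returns "reported working", matching the versions mentioned earlier in the thread.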

RedGlow commented 5 months ago

@revisionarian I've run a couple of tests and can confirm that driver version 23.5.2 works correctly and produces 3D models for the full Monstree set, the reduced one, and my own set of photos. I'll stick to this driver version until the bug is fixed. Thanks again!

DaObst commented 2 months ago

I'm facing the same issue with this error:

WARNING: Could not fuse any points. This is likely caused by incorrect settings - filtering must be enabled for the last call to patch match stereo.
Number of fused points: 0
Elapsed time: 0.392 [minutes]

I'm using an RX 7800 XT with the newest Adrenalin driver, which is how I stumbled over this thread. Is there any chance that this can be fixed without downgrading to an old driver? Sadly, anything from before 23.11.x is not an option for me.