openphotogrammetry / colmap-cl

COLMAP-CL: An OpenCL implementation of COLMAP photogrammetry

Feature Mapping Issue #3

Open seanhart6708 opened 3 years ago

seanhart6708 commented 3 years ago

I am using a small sample set of images for my first project in COLMAP-CL, and all seems to be going well until the feature mapping stage. At this point, the message "Matching Block [1/1, 1/1]" appears. The program continues running, but no progress is made. I believe the issue may be happening because I'm using COLMAP-CL, which is why I'm posting on this forum. A screenshot of the situation I keep running into is attached below.

[Image: Screenshot (1)]

revisionarian commented 3 years ago

@seanhart6708, thanks for posting. It sounds like COLMAP-CL is struggling to utilize the OpenCL resources on your GPU. Your screenshot shows your OpenCL device as "Caicos", which is a pretty old AMD GPU. Here are some things that would help us debug this:

  1. Are you using the latest version (v1.1) of COLMAP-CL? The older versions had very slow feature matching speed.
  2. Do you know which model of AMD GPU you have (e.g. Radeon HD 6450, Radeon R5 310) ?
  3. Can you try running a new reconstruction with the "Quality" level set to "Low" or "Medium" instead of "High" ?
  4. Can you run "clinfo" (available at the bottom of this page) and post the output here so we can see the OpenCL specs of your GPU?

We'd like COLMAP-CL to run successfully on any OpenCL platform, but most of our testing so far has been on GPUs more recent than the Caicos generation. It might take us some days to figure out the problem, so if you need photogrammetry results quickly, you might try other software, such as Regard3D, that doesn't require a fast GPU.
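For anyone collecting the same details, here is a minimal Python sketch that pulls the per-device fields out of saved clinfo text. It assumes the output has one "Key: value" field per line, as the AMD clinfo tool prints it natively; the function name `parse_clinfo` and the sample text are hypothetical and not part of COLMAP-CL.

```python
def parse_clinfo(text):
    """Split line-oriented clinfo output into one dict per device.

    Assumption: each field sits on its own line as "Key: value" and every
    device section starts with a "Device Type" line.
    """
    devices = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not a field
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device Type":
            current = {}
            devices.append(current)
        if current is not None:
            current[key] = value
    return devices


# Hypothetical excerpt in the native one-field-per-line layout.
sample = """\
  Device Type: CL_DEVICE_TYPE_GPU
  Board name: AMD Radeon HD 7450
  Max compute units: 2
  Global memory size: 1073741824
  Name: Caicos
  Driver version: 1800.5 (VM)
"""

for dev in parse_clinfo(sample):
    print(dev["Name"], "|", dev["Board name"], "|", dev["Driver version"])
```

Fields that appear before the first "Device Type" line (the platform header) are ignored by this sketch.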

seanhart6708 commented 3 years ago

Thanks for the fast response. I have the latest version of COLMAP-CL, and adjusting the quality of the reconstruction does not change the issue. My computer has an AMD Radeon HD 7450 GPU. Here is the output when I run clinfo:

  Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.0 AMD-APP (1800.5)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices

  Platform Name: AMD Accelerated Parallel Processing
  Number of devices: 2

  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon HD 7450
  Device Topology: PCI[ B#1, D#0, F#0 ]
  Max compute units: 2
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Native vector width char: 16
  Native vector width short: 8
  Native vector width int: 4
  Native vector width long: 2
  Native vector width float: 4
  Native vector width double: 0
  Max clock frequency: 625Mhz
  Address bits: 32
  Max memory allocation: 536870912
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 2048
  Max image 3D height: 2048
  Max image 3D depth: 2048
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: No
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: None
  Cache line size: 0
  Cache size: 0
  Global memory size: 1073741824
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 32768
  Max pipe arguments: 0
  Max pipe active reservations: 0
  Max pipe packet size: 0
  Max global variable size: 0
  Max global variable preferred total size: 0
  Max read/write image args: 0
  Max on device events: 0
  Queue on device max size: 0
  Max on device queues: 0
  Queue on device preferred size: 0
  SVM capabilities:
    Coarse grain buffer: No
    Fine grain buffer: No
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 64
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: No
    Profiling : No
  Platform ID: 00007FFD40EEF180
  Name: Caicos
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 1.2
  Driver version: 1800.5 (VM)
  Profile: FULL_PROFILE
  Version: OpenCL 1.2 AMD-APP (1800.5)
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only cl_khr_spir cl_khr_gl_event

  Device Type: CL_DEVICE_TYPE_CPU
  Vendor ID: 1002h
  Board name:
  Max compute units: 6
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 1024
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 8
  Preferred vector width double: 4
  Native vector width char: 16
  Native vector width short: 8
  Native vector width int: 4
  Native vector width long: 2
  Native vector width float: 8
  Native vector width double: 4
  Max clock frequency: 3491Mhz
  Address bits: 64
  Max memory allocation: 2629078016
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 64
  Max image 2D width: 8192
  Max image 2D height: 8192
  Max image 3D width: 2048
  Max image 3D height: 2048
  Max image 3D depth: 2048
  Max samplers within kernel: 16
  Max size of kernel argument: 4096
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 16384
  Global memory size: 10516312064
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Global
  Local memory size: 32768
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 2629078016
  Max global variable size: 1879048192
  Max global variable preferred total size: 1879048192
  Max read/write image args: 64
  Max on device events: 0
  Queue on device max size: 0
  Max on device queues: 0
  Queue on device preferred size: 0
  SVM capabilities:
    Coarse grain buffer: No
    Fine grain buffer: No
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 1
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 100
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: Yes
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: No
    Profiling : No
  Platform ID: 00007FFD40EEF180
  Name: AMD FX(tm)-6120 Six-Core Processor
  Vendor: AuthenticAMD
  Device OpenCL C version: OpenCL C 1.2
  Driver version: 1800.5 (sse2,avx,fma4)
  Profile: FULL_PROFILE
  Version: OpenCL 1.2 AMD-APP (1800.5)
  Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_spir cl_khr_gl_event

revisionarian commented 3 years ago

@seanhart6708, thanks for gathering that information. Your GPU is not a fast one by modern standards, but it should be able to run COLMAP-CL. I'm not sure whether COLMAP-CL is freezing when it enters the Feature Matching phase or is just taking a long time to compute the result. These ideas should help us determine what is going on:

  1. Are you giving it plenty of time to compute before you give up? If you have over 100,000 features to match, it could take much longer than a few minutes.
  2. If you set the "block_size" parameter to the low value of "2", then you will see more incremental output messages that allow you to monitor the progress of the Feature Matching process.
  3. You can uncheck the "use_gpu" box to use your computer's CPU instead of the GPU. Does the Feature Matching work when you use the CPU?
  4. Does the matching process work if you make a very small dataset, say a subset of 3-4 images from your project?

Your reported OpenCL driver version (1800.5) is quite old, so you might consider updating your graphics driver.
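To illustrate why a smaller "block_size" gives more progress messages, here is a rough Python sketch. It assumes COLMAP-CL partitions images the way upstream COLMAP's exhaustive matcher does, with the number of blocks equal to ceil(number of images / block_size) and one message per pair of blocks; that assumption is not confirmed for COLMAP-CL, and the function name `progress_messages` is hypothetical.

```python
import math


def progress_messages(num_images, block_size):
    """List the block-pair messages an exhaustive matcher would print.

    Assumption: images are split into ceil(num_images / block_size) blocks
    and every ordered pair of blocks is reported once.
    """
    num_blocks = math.ceil(num_images / block_size)
    return [
        f"Matching Block [{i}/{num_blocks}, {j}/{num_blocks}]"
        for i in range(1, num_blocks + 1)
        for j in range(1, num_blocks + 1)
    ]


# A block size at least as large as the image count yields a single message.
print(progress_messages(4, 50))

# With block_size = 2, four images are split into 2 blocks, giving 4 messages.
for message in progress_messages(4, 2):
    print(message)
```

Under that assumption, a lone "Matching Block [1/1, 1/1]" message simply means the whole image set fits in one block, so no further progress output appears until that block finishes.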

seanhart6708 commented 3 years ago

Ok, it seems like we're making progress! Before, I was definitely giving it enough time to compute (several hours), but it wasn't working. Once I set the block_size to 2, reduced the data set to 4 images, and unchecked "use_gpu", I was able to get a sparse reconstruction of the model.

However, once I moved on to the stereo phase of dense reconstruction, Colmap immediately crashed (it happened right after I hit the "stereo" button). The following message was displayed in my command line:

QObject::~QObject: Timers cannot be stopped from another thread

I did a bit of research and found that the problem may be related to a lack of GPU memory, although there are no definitive answers out there. My GPU has 4864 MB of available memory and my computer has 9.79 GB of available RAM. Do you think the problem may be related to memory? If so, let me know if there's anything I can do to fix it. If not, let me know if you have any thoughts as to what the problem may be.

revisionarian commented 3 years ago

I don't think that the problem is a lack of memory. According to the "clinfo" output above, your GPU has 1 GB of memory (Global memory size: 1073741824). COLMAP-CL would output a useful error message if memory allocation failed.
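The clinfo figures are reported in bytes, so a quick conversion makes them easier to compare. This small Python sketch uses the values posted above; the helper name `bytes_to_gib` is just for illustration.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


# Values copied from the clinfo output posted earlier in this thread.
gpu_global_memory = 1073741824    # GPU "Global memory size"
gpu_max_allocation = 536870912    # GPU "Max memory allocation"
cpu_global_memory = 10516312064   # CPU device "Global memory size"

print(bytes_to_gib(gpu_global_memory))            # 1.0
print(bytes_to_gib(gpu_max_allocation))           # 0.5
print(round(bytes_to_gib(cpu_global_memory), 2))  # 9.79
```

So the GPU exposes 1 GB of global memory to OpenCL, with at most 0.5 GB in any single allocation, while the 9.79 GB figure corresponds to the memory of the CPU device.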

Another COLMAP-CL user reported a problem very similar to yours; see Issue #2. They also had a less powerful GPU, an integrated Intel GPU in their case.

Unfortunately, I don't think that you are going to be able to successfully run the current version of COLMAP-CL on your GPU to perform dense reconstruction. Our team needs to investigate and fix the problem in our software and release a new version of COLMAP-CL that can accommodate older GPUs like the Radeon HD 7450. It will take us a while to do this.

I would recommend that you try different software to process your images. There are several photogrammetry programs that do not require any GPU at all for dense reconstruction; examples of such free software include Regard3D, OpenDroneMap, MicMac, 3DF Zephyr (free for less than 50 images), and others.

If you would be willing to do further testing to help us debug COLMAP-CL, I can send you a special instrumented version of COLMAP-CL that will tell us exactly which line of code is causing the crash on your system.

seanhart6708 commented 3 years ago

Thanks so much for your help. I'd be glad to help with debugging if you send me the instrumented version. You're right, it seems somebody else had an issue very similar to mine. In the meantime, I'll check out those other programs as well. You can contact me at seanmhart24@gmail.com so we don't have to keep posting on the issues page during further testing. Or we can open a discussion under this project on GitHub, whichever seems best.