OpenDroneMap / ODM

A command line toolkit to generate maps, point clouds, 3D models and DEMs from drone, balloon or kite images. 📷
https://opendronemap.org
GNU Affero General Public License v3.0

OpenSfM reconstruct exception #895

Closed merkato closed 5 years ago

merkato commented 6 years ago

How did you install OpenDroneMap? (Docker, natively, ...)?

Installed via Docker (WebODM). I had a working instance here, but I ran webodm.sh update today.

What's your browser and operating system?

Chrome 69, Xubuntu 18.04

What is the problem?

After starting a small dataset (29 images) that had processed successfully before, WebODM throws this error:

[DEBUG] running PYTHONPATH=/code/SuperBuild/install/lib/python2.7/dist-packages /code/SuperBuild/src/opensfm/bin/opensfm create_tracks /var/www/data/6ab4757d-d8b2-4edd-84c0-d1ceedfa6903/opensfm
2018-09-18 10:48:58,944 INFO: reading features
2018-09-18 10:49:00,516 DEBUG: Merging features onto tracks
2018-09-18 10:49:01,902 DEBUG: Good tracks: 32182
[DEBUG] running PYTHONPATH=/code/SuperBuild/install/lib/python2.7/dist-packages /code/SuperBuild/src/opensfm/bin/opensfm reconstruct /var/www/data/6ab4757d-d8b2-4edd-84c0-d1ceedfa6903/opensfm
2018-09-18 10:49:06,349 INFO: Starting incremental reconstruction
Traceback (most recent call last):
  File "/code/SuperBuild/src/opensfm/bin/opensfm", line 34, in <module>
    command.run(args)
  File "/code/SuperBuild/src/opensfm/opensfm/commands/reconstruct.py", line 21, in run
    report = reconstruction.incremental_reconstruction(data)
  File "/code/SuperBuild/src/opensfm/opensfm/reconstruction.py", line 1161, in incremental_reconstruction
    pairs = compute_image_pairs(common_tracks, data)
  File "/code/SuperBuild/src/opensfm/opensfm/reconstruction.py", line 416, in compute_image_pairs
    result = parallel_map(_compute_pair_reconstructability, args, processes)
  File "/code/SuperBuild/src/opensfm/opensfm/context.py", line 38, in parallel_map
    return list(e.map(func, args))
  File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 788, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 589, in result_iterator
    yield future.result()
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 433, in result
    return self.__get_result()
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 381, in __get_result
    raise self._exception
loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
Traceback (most recent call last):
  File "/code/run.py", line 47, in <module>
    plasm.execute(niter=1)
  File "/code/scripts/run_opensfm.py", line 133, in process
    (context.pyopencv_path, context.opensfm_path, tree.opensfm))
  File "/code/opendm/system.py", line 34, in run
    raise Exception("Child returned {}".format(retcode))
Exception : Child returned 1

What should be the expected behavior? If this is a feature request, please describe in detail the changes you think should be made to the code, citing files and lines where changes should be made, if possible.

Old good OpenDroneMap ;)

How can we reproduce this? (What steps did you do to trigger the problem? What parameters are you using for processing? If possible please include a copy of your dataset uploaded on Google Drive or Dropbox. Be detailed)

Using this dataset: https://github.com/merkato/odm_mygla_dataset

merkato commented 6 years ago

[Screenshot: zaznaczenie_140]

Another dataset, standard settings.

pierotofy commented 6 years ago

@merkato are you able to process the dataset if you lower the --max-concurrency parameter?

Could you copy/paste the output result of sudo lshw -short?

I've just processed odm_mygla without errors on a 24 GB machine.

pierotofy commented 6 years ago

image

pierotofy commented 6 years ago

Related: http://community.opendronemap.org/t/process-exited-with-code-1-adjusted-memory-min-num-features-but-failed-in-the-middle-of-processing-pls-help/1130

It looks like _compute_pair_reconstructability has increased memory usage quite a bit since the last OpenSfM update. Will need to investigate and possibly remove / decrease the parallelism.

bertramt commented 6 years ago

I'm having the same issue: I'm unable to process the odm_mygla dataset successfully. Ubuntu 18.04 with 128 GB of RAM, running as a VM on a Proxmox server. WebODM is running as a Docker install. I've tried as many as 52 CPUs, but currently the VM is set to 4 CPUs. I have tried setting max-concurrency to 1 with no luck.

If it helps at all, here is my lshw output:

H/W path         Device           Class          Description
===========================================================
                                  system         Standard PC (i440FX + PIIX, 1996)
/0                                bus            Motherboard
/0/0                              memory         96KiB BIOS
/0/400                            processor      Common KVM processor
/0/1000                           memory         128GiB System Memory
/0/1000/0                         memory         16GiB DIMM RAM
/0/1000/1                         memory         16GiB DIMM RAM
/0/1000/2                         memory         16GiB DIMM RAM
/0/1000/3                         memory         16GiB DIMM RAM
/0/1000/4                         memory         16GiB DIMM RAM
/0/1000/5                         memory         16GiB DIMM RAM
/0/1000/6                         memory         16GiB DIMM RAM
/0/1000/7                         memory         16GiB DIMM RAM
/0/100                            bridge         440FX - 82441FX PMC [Natoma]
/0/100/1                          bridge         82371SB PIIX3 ISA [Natoma/Triton II]
/0/100/1.1                        storage        82371SB PIIX3 IDE [Natoma/Triton II]
/0/100/1.2                        bus            82371SB PIIX3 USB [Natoma/Triton II]
/0/100/1.2/1     usb1             bus            UHCI Host Controller
/0/100/1.2/1/1                    input          QEMU USB Tablet
/0/100/1.3                        bridge         82371AB/EB/MB PIIX4 ACPI
/0/100/2                          display        VGA compatible controller
/0/100/3                          generic        Virtio memory balloon
/0/100/3/0                        generic        Virtual I/O device
/0/100/8                          communication  Virtio console
/0/100/8/0                        generic        Virtual I/O device
/0/100/a                          storage        Virtio block device
/0/100/a/0       /dev/vda         disk           128GB Virtual I/O device
/0/100/a/0/1     /dev/vda1        volume         1023KiB BIOS Boot partition
/0/100/a/0/2     /dev/vda2        volume         119GiB EXT4 volume
/0/100/12                         network        Virtio network device
/0/100/12/0      ens18            network        Ethernet interface
/0/100/1e                         bridge         QEMU PCI-PCI bridge
/0/100/1f                         bridge         QEMU PCI-PCI bridge
/0/1             scsi1            storage
/0/1/0.0.0       /dev/cdrom       disk           QEMU DVD-ROM
/1               vethd75339e      network        Ethernet interface
/2               veth2fc11a2      network        Ethernet interface
/3               br-3abc8ee2face  network        Ethernet interface
/4               vethc564d39      network        Ethernet interface
/5               docker0          network        Ethernet interface
/6               vethcbc2015      network        Ethernet interface
/7               veth2f4f537      network        Ethernet interface

pierotofy commented 6 years ago

Looks like this is not a memory problem, but a compilation problem.

A user reported that an Illegal Instruction exception is being thrown by OpenSfM. This indicates that a dependency of OpenSfM was compiled with optimizations that are not available on some CPUs.

http://community.opendronemap.org/t/process-exited-with-code-1-adjusted-memory-min-num-features-but-failed-in-the-middle-of-processing-pls-help/1130/5

This is likely affecting a lot of users.

Matronek commented 6 years ago

So where do we go from here? I'm just glad this is a bug and not something I did because this is the first time installing ODM. I thought I would try it out on Ubuntu 18.04 and it's been frustrating.

pierotofy commented 6 years ago

I'll double check our automated build process, perhaps we left out a parameter last time we built the docker images. I'll post an update in a few hours.

bertramt commented 6 years ago

It worked on Proxmox with a KVM CPU a couple weeks ago. Hopefully it's an easy fix.

pierotofy commented 6 years ago

Can everyone who was affected by this error try to do a ./webodm.sh update and try to process their datasets again? Is the problem still there?

owentorgerson commented 6 years ago

I would say it was a success!! Processed minutes ago...

marina3d

bertramt commented 6 years ago

I would agree. Mine processed the odm_mygla dataset just fine now

pierotofy commented 6 years ago

Thank you all for your help. I think we mistakenly built the docker images without the cross compilation flags enabled last time. I will close this, feel free to reopen if the problem persists.

smathermather commented 6 years ago

👍

owentorgerson commented 6 years ago

Attempted a larger model

=====

[INFO] Found 181 usable images
[DEBUG] running /code/build/bin/odm_extract_utm -imagesPath /var/www/data/9bf80328-e2e8-4b1c-8c2c-119f10b27c07/images/ -imageListFile /var/www/data/9bf80328-e2e8-4b1c-8c2c-119f10b27c07/img_list.txt -outputCoordFile /var/www/data/9bf80328-e2e8-4b1c-8c2c-119f10b27c07/odm_georeferencing/coords.txt -logFile /var/www/data/9bf80328-e2e8-4b1c-8c2c-119f10b27c07/odm_georeferencing/odm_georeferencing_utm_log.txt
Warning: Directory Photo has an unexpected next pointer; ignored.
Warning: Directory Iop has an unexpected next pointer; ignored.
Warning: Directory GPSInfo has an unexpected next pointer; ignored.

========

Resulting in....

Error loading: /var/www/data/9bf80328-e2e8-4b1c-8c2c-119f10b27c07/images/qq-expansion_047.JPG
Traceback (most recent call last):
  File "/code/run.py", line 47, in <module>
    plasm.execute(niter=1)
  File "/code/scripts/smvs.py", line 67, in process
    system.run('%s %s %s' % (context.makescene_path, tree.mve_path, tree.smvs))
  File "/code/opendm/system.py", line 34, in run
    raise Exception("Child returned {}".format(retcode))
Exception: Child returned 1

Thoughts?

merkato commented 6 years ago

Once again: updated from docker today. Lenovo T500, Xubuntu 18.04, docker webodm.

Here we go a step further:

Created 29 views with 27 valid cameras.
Imported 27 undistorted images.
[DEBUG] running /code/SuperBuild/src/elibs/smvs/app/smvsrecon -t2 -a1.0 --max-pixels=409600 -o2 --debug-lvl=0 /var/www/data/b283e184-5ec0-4650-9a6d-e4c7bf0312bf/smvs
Shading-aware Multi-view Stereo (built on Sep 26 2018, 17:54:18)

Initializing scene with 27 views...
Initialized 27 views (max ID is 28), took 4ms.
Reading Photosynther file (29 cameras, 30984 features)...
Automatic input scale: 2
Input embedding: undist-L2
Output embedding: smvs-B2
Running view selection for 27 views... done, took 3.019s.
Skipping 2 views with insufficient number of neighbors.
Skipped IDs: 1 2
Resizing input images for 25 views... done, took 7.618s.
Starting 1/25 ID: 0 Neighbors: 26 14 22 13 21 10
Starting 2/25 ID: 3 Neighbors: 16 27 5 11 28 15
Illegal instruction (core dumped)
Traceback (most recent call last):
  File "/code/run.py", line 47, in <module>
    plasm.execute(niter=1)
  File "/code/scripts/smvs.py", line 82, in process
    system.run('%s %s %s' % (context.smvs_path, ' '.join(config), tree.smvs))
  File "/code/opendm/system.py", line 34, in run
    raise Exception("Child returned {}".format(retcode))
Exception : Child returned 132

"It looks like this computer might be too old. WebODM requires a computer with a 64-bit CPU supporting MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support or higher. You can still run WebODM if you compile your own docker images. See this page for more information."

lshw output:

H/W path         Device           Class          Description
                                  system         2056A24
/0                                bus            2056A24
/0/0                              memory         128KiB BIOS
/0/6                              processor      Intel(R) Core(TM)2 Duo CPU
/0/6/a                            memory         64KiB L1 cache
/0/6/c                            memory         3MiB L2 cache
/0/b                              memory         64KiB L1 cache
/0/2b                             memory         8GiB System Memory
/0/2b/0                           memory         4GiB SODIMM Synchronous 1066 MH
/0/2b/1                           memory         4GiB SODIMM Synchronous 1066 MH
/0/100                            bridge         Mobile 4 Series Chipset Memory
/0/100/1                          bridge         Mobile 4 Series Chipset PCI Exp
/0/100/1/0                        generic        Illegal Vendor ID
/0/100/2                          display        Mobile 4 Series Chipset Integra
/0/100/3                          communication  Mobile 4 Series Chipset MEI Con
/0/100/3.2                        storage        Mobile 4 Series Chipset PT IDER
/0/100/3.3                        communication  Mobile 4 Series Chipset AMT SOL
/0/100/19        enp0s25          network        82567LM Gigabit Network Connect
/0/100/1a                         bus            82801I (ICH9 Family) USB UHCI C
/0/100/1a/1      usb3             bus            UHCI Host Controller
/0/100/1a/1/1                     input          USB Receiver
/0/100/1a.1                       bus            82801I (ICH9 Family) USB UHCI C
/0/100/1a.1/1    usb4             bus            UHCI Host Controller
/0/100/1a.1/1/2                   communication  ThinkPad Bluetooth with Enhance
/0/100/1a.2                       bus            82801I (ICH9 Family) USB UHCI C
/0/100/1a.2/1    usb5             bus            UHCI Host Controller
/0/100/1a.7                       bus            82801I (ICH9 Family) USB2 EHCI
/0/100/1a.7/1    usb1             bus            EHCI Host Controller
/0/100/1b                         multimedia     82801I (ICH9 Family) HD Audio C
/0/100/1c                         bridge         82801I (ICH9 Family) PCI Expres
/0/100/1c.1                       bridge         82801I (ICH9 Family) PCI Expres
/0/100/1c.1/0    wlp3s0           network        PRO/Wireless 5100 AGN [Shiloh]
/0/100/1c.3                       bridge         82801I (ICH9 Family) PCI Expres
/0/100/1c.4                       bridge         82801I (ICH9 Family) PCI Expres
/0/100/1d                         bus            82801I (ICH9 Family) USB UHCI C
/0/100/1d/1      usb6             bus            UHCI Host Controller
/0/100/1d.1                       bus            82801I (ICH9 Family) USB UHCI C
/0/100/1d.1/1    usb7             bus            UHCI Host Controller
/0/100/1d.2                       bus            82801I (ICH9 Family) USB UHCI C
/0/100/1d.2/1    usb8             bus            UHCI Host Controller
/0/100/1d.7                       bus            82801I (ICH9 Family) USB2 EHCI
/0/100/1d.7/1    usb2             bus            EHCI Host Controller
/0/100/1e                         bridge         82801 Mobile PCI Bridge
/0/100/1e/0                       bridge         RL5c476 II
/0/100/1e/0.1                     bus            R5C832 IEEE 1394 Controller
/0/100/1e/0.2                     generic        R5C822 SD/SDIO/MMC/MS/MSPro Hos
/0/100/1e/0.4                     generic        R5C592 Memory Stick Bus Host Ad
/0/100/1e/0.5                     generic        xD-Picture Card Controller
/0/100/1f                         bridge         ICH9M-E LPC Interface Controlle
/0/100/1f.2                       storage        82801IBM/IEM (ICH9M/ICH9M-E) 4
/0/100/1f.3                       bus            82801I (ICH9 Family) SMBus Cont
/0/1             scsi2            storage
/0/1/0.0.0       /dev/sda         disk           1TB ST1000LM048-2E71
/0/1/0.0.0/1     /dev/sda1        volume         186GiB EXT4 volume
/0/1/0.0.0/2     /dev/sda2        volume         745GiB Extended partition
/0/1/0.0.0/2/5   /dev/sda5        volume         186GiB EXT4 volume
/0/1/0.0.0/2/6   /dev/sda6        volume         558GiB EXT4 volume
/0/2             scsi3            storage
/0/2/0.0.0       /dev/cdrom       disk           DVDRAM GSA-U20N
/1                                power          COMPATIBLE
/2               veth5a60cce      network        Ethernet interface
/3               br-e5bdfd493b0c  network        Ethernet interface
/4               veth281865d      network        Ethernet interface
/5               veth052543d      network        Ethernet interface
/6               docker0          network        Ethernet interface
/7               vethc022610      network        Ethernet interface
/8               veth92edae9      network        Ethernet interface
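
For readers wondering how "Child returned 132" relates to the "Illegal instruction (core dumped)" line in the log above: 132 is consistent with the common shell convention of reporting a signal death as 128 plus the signal number, and signal 4 is SIGILL. The following is a minimal Python sketch of that decoding. The helper name `describe_exit` is hypothetical and not part of ODM, and it assumes the command was launched through a shell (or through Python's `subprocess`, which reports signals as negative numbers instead).

```python
import signal


def describe_exit(code):
    """Translate a child exit status into a readable description.

    Two common conventions are handled:
      * shells report death-by-signal as 128 + signal number
      * Python's subprocess module reports it as a negative number
    """
    if code < 0:
        signum = -code
    elif code > 128:
        signum = code - 128
    else:
        # An ordinary exit status, e.g. the "Child returned 1" seen earlier.
        return "exited with status {}".format(code)
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # The number does not map to a signal known on this platform.
        name = "unknown signal {}".format(signum)
    return "terminated by {}".format(name)


print(describe_exit(132))
print(describe_exit(1))
```

Read this way, status 1 only says that the child reported a failure itself, while 132 says that the process was stopped by the CPU rejecting an instruction, which points at how the binary was built.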

merkato commented 6 years ago

Status: Failed Options: dsm: true, max-concurrency: 1, rerun-from: dataset

sudo lshw -class processor

  *-cpu
       description: CPU
       product: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
       vendor: Intel Corp.
       physical id: 6
       bus info: cpu@0
       version: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
       slot: None
       size: 911MHz
       capacity: 2401MHz
       width: 64 bits
       clock: 266MHz
       capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm ida cpufreq
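
The capabilities line can be checked against the instruction sets named in the WebODM message (MMX, SSE, SSE2, SSE3, SSSE3). Below is a small Python sketch of that check. Two points are assumptions rather than facts from this thread: the `CANDIDATES` list is only a guess at newer extensions a compiler might have emitted (the thread does not say which instruction actually failed), and the helper `missing_flags` is a hypothetical name. Note that lshw and /proc/cpuinfo list SSE3 under the flag name `pni`.

```python
# Capabilities copied from the `lshw -class processor` output above.
CAPABILITIES = (
    "fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge "
    "mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe "
    "syscall nx x86-64 constant_tsc arch_perfmon pebs bts nopl cpuid "
    "aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr "
    "pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm ida "
    "cpufreq"
)

# Instruction sets named in the WebODM message, mapped to their flag names.
# SSE3 is reported as "pni" (Prescott New Instructions).
REQUIRED = {"MMX": "mmx", "SSE": "sse", "SSE2": "sse2",
            "SSE3": "pni", "SSSE3": "ssse3"}

# ASSUMPTION: newer extensions that an optimized build might use.
# This list is illustrative and is not taken from the thread.
CANDIDATES = ["sse4_2", "popcnt", "avx", "avx2"]


def missing_flags(capabilities, wanted):
    """Return the flags from `wanted` that are absent from the CPU flag string."""
    present = set(capabilities.split())
    return [flag for flag in wanted if flag not in present]


missing_required = missing_flags(CAPABILITIES, REQUIRED.values())
missing_candidates = missing_flags(CAPABILITIES, CANDIDATES)

print("Missing from the documented minimum:", missing_required or "none")
print("Missing newer extensions:", missing_candidates or "none")
```

For this CPU the documented minimum is fully present, so the "too old" message does not match the stated requirement; the flags that are absent are all newer extensions, which fits the explanation that a binary was built with optimizations this processor lacks.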

pierotofy commented 6 years ago

Thank you @merkato. This will need to be fixed upstream or through some modifications on our build scripts. We will try to fix it upstream as a first attempt.

https://github.com/flanggut/smvs/pull/31

pierotofy commented 6 years ago

Current temporary workaround is to use --use-opensfm-dense.

bertramt commented 6 years ago

"It looks like this computer might be too old. WebODM requires a computer with a 64-bit CPU supporting MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support or higher. You can still run WebODM if you compile your own docker images. See this page for more information."

@merkato I haven't tried since the update, but I happened to get that error when I set max-concurrency: 1. When it was set to 2 or left at the default, I didn't get that particular error.

merkato commented 6 years ago

With max-concurrency set to 1 or 2 I get the same error. I understand that my ThinkPad T500 is old, but I like it. I'll wait for the upstream fixes; maybe I won't need to buy a new computer ;)

boronian commented 5 years ago

I have the same error using plain ODM (not WebODM) on Docker in Windows. lshw doesn't seem to work in my environment... or can I install it somehow? Windows 10 Pro, Docker Desktop, 15 threads and 12 GB RAM allocated.

The last lines of the log before the failure and the full error message were:

Termination: CONVERGENCE
2019-05-15 16:06:21,837 INFO: Adding DJI_c1_0354.JPG to the reconstruction
2019-05-15 16:06:21,948 INFO: Re-triangulating
Killed
Traceback (most recent call last):
  File "/code/run.py", line 47, in <module>
    plasm.execute(niter=1)
  File "/code/scripts/run_opensfm.py", line 133, in process
    (context.pyopencv_path, context.opensfm_path, tree.opensfm))
  File "/code/opendm/system.py", line 34, in run
    raise Exception("Child returned {}".format(retcode))

EDIT: as suggested by @pierotofy I tried again with --use-opensfm-dense (and I increased the swap in the Docker settings). This time it got much further but crashed again, I think during the last steps of orthophoto creation. I guess I'll open a new issue for that, though, as it seems to be independent.
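
A bare "Killed" line with no error from the child itself often means the process received SIGKILL, and on Linux the kernel's out-of-memory killer is a common sender, which would fit the improvement seen after increasing swap. That reading is an assumption here, not something the log proves. One way to see how much memory and swap a container actually has is to read /proc/meminfo from inside it. The sketch below parses that file's format; the function names and the sample values are illustrative only.

```python
import re


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: size in kB}."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Illustrative sample; these numbers are NOT taken from the thread.
SAMPLE = """MemTotal:       12288000 kB
MemAvailable:    1024000 kB
SwapTotal:       2097152 kB
SwapFree:        2097152 kB
"""

info = parse_meminfo(SAMPLE)
print("RAM: {:.1f} GiB".format(info["MemTotal"] / 1048576.0))
print("Swap: {:.1f} GiB".format(info["SwapTotal"] / 1048576.0))
```

Calling `read_meminfo()` inside the running container instead of parsing the sample shows the limits that the processing job really sees, which can differ from what the host reports.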

smathermather commented 5 years ago

This issue is getting a bit confused. I am going to close this. @boronian -- please open another one with lots of details of version, error, things tried. Thanks!