mdaiter / openMVG

openMVG with a LATCH descriptor, an ORB descriptor, DEEP descriptors from the cvpr15compare repo, PNNet/Torch loader and a GPU-based L2 matcher integrated

Runtime crash when using GPU accelerated descriptors #10

Closed donlk closed 8 years ago

donlk commented 8 years ago

When using LATCH_UNSIGNED:
- EXTRACT FEATURES -
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
Using max kepoints: 30720
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save a registered polymorphic type with an unregistered polymorphic cast.
Could not find a path to a base class (openMVG::features::Image_describer) for type: openMVG::features::LATCH_Image_describer
Make sure you either serialize the base class at some point via cereal::base_class or cereal::virtual_base_class.
Alternatively, manually register the association with CEREAL_REGISTER_POLYMORPHIC_RELATION.

or LATCH_BINARY:
- EXTRACT FEATURES -
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
rapidjson internal assertion failure: IsObject()
Cannot dynamically allocate the Image_describer interface.

DEEP_SIAM and PNNET options are also failing due to the missing deep network binaries you removed recently.

mdaiter commented 8 years ago

Hey @donlk, are you sure PNNET fails? Isn't it DEEPSIAM2STREAM that's failing?

I think I know what's up with this. Weird that it's not picking that up. I think it's just a registration problem. I'll get back to you in about an hour with updates.

donlk commented 8 years ago

Yes it does:
You called : /mnt/linuxdata/Development/work/projects/sfmrecon/built/linux-x86_64/install/OpenMVG/bin/openMVG_main_ComputeFeatures --input_file /mnt/linuxdata/Development/work/projects/sfmrecon/testsets/cookie/calculation/sfm/matches/sfm_data.json --outdir /mnt/linuxdata/Development/work/projects/sfmrecon/testsets/cookie/calculation/sfm/matches --describerMethod PNNET --upright 0 --describerPreset HIGH --force 0 --numThreads 7
EXTRACT FEATURES
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
From inside DeepClassifierTHNets, opening: /home/nomoko/Code/openMVG/src/openMVG/features/deep/networks/pnnet/
lasterror Couldn't load network

I'm gonna check out your posts tomorrow, it's getting late in central Europe.

mdaiter commented 8 years ago

@donlk Oh, I see the issue. That's a minor bug dealing with hardcoded paths. Let me upload a location-independent version. Also, what branch are you on?

mdaiter commented 8 years ago

@donlk PNNet bug should be fixed.

donlk commented 8 years ago

I'm on your custom branch.

mdaiter commented 8 years ago

@donlk Can you pull and compile?

donlk commented 8 years ago

All right. Also, can you give me the source repo of the LATCH submodule you're using? I'm unable to commit my compute level changes because it's not part of the openMVG repo.

mdaiter commented 8 years ago

@donlk https://www.github.com/mdaiter/cudaLATCH

mdaiter commented 8 years ago

@donlk also - that problem with cereal not recognising the LATCH describer is strange, because I explicitly register it here: https://github.com/mdaiter/openMVG/blob/custom/src/openMVG/features/image_describer_latch.hpp

Can you give me a set to reproduce these errors?

mdaiter commented 8 years ago

@donlk I'll compile it and run some tests tomorrow. I know you're trying to get to bed. Good night!

mdaiter commented 8 years ago

@donlk found the problem! Could you add this line:
CEREAL_REGISTER_POLYMORPHIC_RELATION(openMVG::features::Image_describer, openMVG::features::LATCH_Image_describer);
right after this line: https://github.com/mdaiter/openMVG/blob/custom/src/openMVG/features/image_describer_latch.hpp#L185
I'd add it myself, but if I add it to my repo my version of Cereal breaks. Thanks!

mdaiter commented 8 years ago

@donlk did that fix it?

donlk commented 8 years ago

Testing now. Sorry, I mistakenly deleted my previous comment.

donlk commented 8 years ago

Still not working, but got a slightly different message:
- EXTRACT FEATURES -
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
Using max kepoints: 30720
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save a registered polymorphic type with an unregistered polymorphic cast.
Could not find a path to a base class (openMVG::features::Regions) for type: openMVG::features::Scalar_Regions<openMVG::features::SIOPointFeature, unsigned int, 64ul>
Make sure you either serialize the base class at some point via cereal::base_class or cereal::virtual_base_class.
Alternatively, manually register the association with CEREAL_REGISTER_POLYMORPHIC_RELATION.

donlk commented 8 years ago

It seems the problem is quite trivial. Pierre updated cereal to 1.2.0, and that introduced additional required relation definitions for feature detectors, which broke your fork. See here: https://github.com/openMVG/openMVG/commit/ff22ff265da0789b785fca9206b812a04c844c6e
So all we have to do is add the necessary polymorphic relations for the LATCH and deep classifiers in regions_factory.hpp.
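For readers unfamiliar with the error: the exception text says the serializer could not find a registered path from the concrete type up to its base class. The sketch below is a toy model of that kind of lookup, using only the standard library. It is not cereal's implementation, and RelationRegistry, register_relation, require_path and the four structs are hypothetical stand-ins. It only illustrates why registering the describer relation alone still left the regions type failing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

// Toy stand-in for a serializer's table of known base/derived relations.
class RelationRegistry {
 public:
  // Record that Derived can be cast up to Base.
  template <class Base, class Derived>
  void register_relation() {
    static_assert(std::is_base_of<Base, Derived>::value, "Derived must inherit Base");
    static_assert(!std::is_same<Base, Derived>::value, "types must differ");
    bases_[std::type_index(typeid(Derived))].insert(std::type_index(typeid(Base)));
  }

  // Throw if no chain of registered relations leads from Derived up to Base.
  template <class Base, class Derived>
  void require_path() const {
    if (!reachable(std::type_index(typeid(Derived)), std::type_index(typeid(Base))))
      throw std::runtime_error("unregistered polymorphic cast");
  }

 private:
  bool reachable(std::type_index from, std::type_index to) const {
    if (from == to) return true;
    const auto it = bases_.find(from);
    if (it == bases_.end()) return false;
    for (const auto& base : it->second)
      if (reachable(base, to)) return true;
    return false;
  }

  std::map<std::type_index, std::set<std::type_index>> bases_;
};

// Hypothetical stand-ins for the describer and regions hierarchies.
struct ImageDescriber { virtual ~ImageDescriber() = default; };
struct LatchDescriber : ImageDescriber {};
struct Regions { virtual ~Regions() = default; };
struct ScalarRegions : Regions {};

// Replays the order of events in this thread; returns true if the regions
// lookup failed while only the describer relation was registered.
bool demo() {
  RelationRegistry reg;
  reg.register_relation<ImageDescriber, LatchDescriber>();
  reg.require_path<ImageDescriber, LatchDescriber>();  // found, does not throw

  bool regions_failed = false;
  try {
    reg.require_path<Regions, ScalarRegions>();
  } catch (const std::runtime_error&) {
    regions_failed = true;  // the second, "slightly different" failure
  }

  reg.register_relation<Regions, ScalarRegions>();
  reg.require_path<Regions, ScalarRegions>();  // found once registered
  assert(regions_failed);
  return regions_failed;
}
```

In the real code base the registration is done with cereal's CEREAL_REGISTER_POLYMORPHIC_RELATION macro, once per base/derived pair, as in the line quoted earlier in the thread.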

mdaiter commented 8 years ago

@donlk that appears to be correct.

donlk commented 8 years ago

I may have a pull request for you soon including this little fix, and hopefully one for the network binary file path issue: https://github.com/mdaiter/openMVG/commit/58483973f0e446995fed31025de2b49afd767185
A relative path doesn't really work there, because at run time it is resolved against the directory the binary is invoked from.
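To make the path problem concrete, here is a minimal sketch of one possible approach: resolving the network directory against the executable's own location instead of the invocation directory. It assumes C++17 and Linux (it reads /proc/self/exe). The helper names executable_dir and resolve_against and the directory layout in the usage note are hypothetical, and the thread does not say which fix was eventually applied.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Directory that contains the running executable.
// Linux-specific: follows the /proc/self/exe symlink.
// Falls back to the current working directory if that fails.
fs::path executable_dir() {
  std::error_code ec;
  const fs::path exe = fs::read_symlink("/proc/self/exe", ec);
  return ec ? fs::current_path() : exe.parent_path();
}

// Resolve `relative` against a fixed absolute `anchor` rather than against
// whatever directory the program happened to be started from.
fs::path resolve_against(const fs::path& anchor, const fs::path& relative) {
  assert(anchor.is_absolute());
  if (relative.is_absolute()) return relative.lexically_normal();
  return (anchor / relative).lexically_normal();
}
```

With a hypothetical install layout, resolve_against(executable_dir(), "../share/networks/pnnet") would yield the same directory no matter where the binary is invoked from. An install prefix set at build time or a command-line option would be alternative anchors.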

donlk commented 8 years ago

All right, here's another gem. I've put in the missing declarations, but it only got me so far:
Running latch
Ran latch
Copying memory back
Memory copied
3/37 | 30720 features | 27ms
OpenCV Error: Assertion failed (!scalar.empty() || (src2.type() == src1.type() && src2.size() == src1.size())) in arithm_op, file /mnt/linuxdata/Development/work/projects/sfmrecon/3rdparty/OpenCV/modules/cudaarithm/src/element_operations.cpp, line 141
terminate called after throwing an instance of 'cv::Exception'
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in GpuMat, file /mnt/linuxdata/Development/work/projects/sfmrecon/3rdparty/OpenCV/modules/core/src/cuda_gpu_mat.cpp, line 152
OpenCV Error: Assertion failed (!scalar.empty() || (src2.type() == src1.type() && src2.size() == src1.size())) in arithm_op, file /mnt/linuxdata/Development/work/projects/sfmrecon/3rdparty/OpenCV/modules/cudaarithm/src/element_operations.cpp, line 141
what(): /mnt/linuxdata/Development/work/projects/sfmrecon/3rdparty/OpenCV/modules/cudaarithm/src/element_operations.cpp:141: error: (-215) !scalar.empty() || (src2.type() == src1.type() && src2.size() == src1.size()) in function arithm_o

mdaiter commented 8 years ago

Which parameters did you pass to the binary?

donlk commented 8 years ago

What do you mean? I'm running it with LATCH_UNSIGNED. Or do you mean the network binary paths? Are they used with the LATCH descriptor?

mdaiter commented 8 years ago

@donlk the only place where I call OpenCV code is here: https://github.com/mdaiter/cudaLATCH/blob/695ffdcb68dd8214aafdc368593fc67749a3645c/LatchClassifierOpenMVG.cpp#L113-L127 Are you sure you're not passing in any empty images?

mdaiter commented 8 years ago

@donlk well, and here: https://github.com/mdaiter/cudaLATCH/blob/695ffdcb68dd8214aafdc368593fc67749a3645c/LatchClassifierOpenMVG.cpp#L68-L95 but it's effectively the same code.

donlk commented 8 years ago

My bad. I was running the feature detection in parallel, and the ORB detector in OpenCV went haywire :) LATCH is working fine now and it's crazy fast. I'm seeing detection times more than an order of magnitude faster than SIFT here, with roughly the same number of features, on a GTX 970.

mdaiter commented 8 years ago

@donlk Oh yeah! That totally happened to me all the time. It's a massive pain.

donlk commented 8 years ago

OK, that pull request shouldn't take much longer now.

donlk commented 8 years ago

Also, a couple of questions:

csp256 commented 8 years ago

The GPU brute force Hamming matcher was designed to be used with GPU accelerated LATCH descriptors. I don't know how to call it (lol), all I did was write it. @mdaiter integrated it into OpenMVG. Make sure that you use a threshold which is reasonable for binary descriptors; not one that is designed for float descriptors.
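As background for the threshold remark: float descriptors are usually compared with an L2 distance, while binary descriptors are compared with a Hamming distance, which is an integer count of differing bits between 0 and the descriptor length. A cutoff tuned for one scale is meaningless on the other. The sketch below is a plain CPU reference, not the GPU matcher discussed here. The 512-bit width, the names Descriptor, hamming and best_match, and the use of an absolute bit-count cutoff are all assumptions for illustration; the thread does not say which kind of threshold the matcher expects.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumed descriptor width for illustration: 8 x 64 = 512 bits.
constexpr std::size_t kWords = 8;
using Descriptor = std::array<std::uint64_t, kWords>;

// Hamming distance: number of bit positions where a and b differ.
int hamming(const Descriptor& a, const Descriptor& b) {
  int distance = 0;
  for (std::size_t i = 0; i < kWords; ++i)
    distance += static_cast<int>(std::bitset<64>(a[i] ^ b[i]).count());
  return distance;
}

// Brute-force nearest neighbour of `query` in `train`.
// Returns the index of the closest descriptor, or -1 if even the closest
// one differs in more than `max_bits` bits (or `train` is empty).
int best_match(const Descriptor& query, const std::vector<Descriptor>& train,
               int max_bits) {
  int best_index = -1;
  int best_distance = std::numeric_limits<int>::max();
  for (std::size_t i = 0; i < train.size(); ++i) {
    const int d = hamming(query, train[i]);
    if (d < best_distance) {
      best_distance = d;
      best_index = static_cast<int>(i);
    }
  }
  return best_distance <= max_bits ? best_index : -1;
}
```

Because the distance here is a bit count bounded by the descriptor length, a cutoff has to be chosen on that integer scale rather than copied from a float L2 setting.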

donlk commented 8 years ago

Pull: https://github.com/mdaiter/openMVG/pull/11
The path fix for the deep network binaries has to wait until tomorrow.