AprilRobotics / apriltag

AprilTag is a visual fiducial system popular for robotics research.
https://april.eecs.umich.edu/software/apriltag

quick_decode struct allocates 6Gb memory for standard52h13 family #343

Closed (kde-baskets-user closed 2 months ago)

kde-baskets-user commented 3 months ago

Describe the bug: quick_decode_init() allocates too much memory for the standard52h13 family. Are there any workarounds?

To Reproduce: Create a standard52h13 detector and check the allocated memory.

Expected behavior: Parsing 15 bits of useful data from an image should take a more reasonable amount of memory.

Operating System: Ubuntu

Installation Method: I built AprilTag from source following the instructions in the README

Code version: 3.4.2 from GitHub

christian-rauch commented 3 months ago

I can confirm the high memory consumption by running ./opencv_demo --camera 2 --family tagStandard52h13 and checking the memory consumption with top -p $(pidof opencv_demo), which reports around 6GB RES consumption.

The tagStandard52h13 has one of the largest code books (48714 codes). Loading an even larger code book from the tagCircle49h12 (65535 codes) will consume even more memory (more than 7GB in my case).

> Parsing 15 bits of useful data from an image should take a more reasonable amount of memory.

How do you get to 15 bits? As the name tagStandard52h13 suggests, this tag family uses 52-bit codes with a minimum Hamming distance of 13.

If you need to reduce the size of the code book, I simply suggest using a different tag family with fewer encoded bits. You can also generate your own tag family for specific requirements.

You can also adjust the bits_corrected parameter, which is set by default to 2 here: https://github.com/AprilRobotics/apriltag/blob/3806edf38ac4400153677e510c9f9dcb81f472c8/apriltag.h#L240-L246

As the comment there and the code itself show, a higher bits_corrected / maxhamming value increases capacity and nentries, so more quick_decode_entry structs are allocated.
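The relationship between bits_corrected and the allocation can be turned into a rough estimate. The following is a minimal sketch, assuming the sizing logic in quick_decode_init() works the way I read it in the 3.x sources: one slot per code, plus one per single-bit and per ordered two-bit (and three-bit) error pattern, with the table over-allocated by a factor of 3. The per-pattern terms, the factor of 3, and the entry layout are assumptions, not code copied from the library, and the names qd_entry_sketch, qd_estimate_entries and qd_estimate_bytes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one lookup-table slot, modelled on my reading of
 * struct quick_decode_entry. With padding this is 16 bytes on common
 * 64-bit ABIs. */
struct qd_entry_sketch {
    uint64_t rcode;    /* the (possibly bit-flipped) code word */
    uint16_t id;       /* tag ID the code word maps to */
    uint8_t  hamming;  /* number of corrected bit errors */
    uint8_t  rotation; /* tag rotation */
};

/* Estimated number of table slots for a family with `ncodes` codes of
 * `nbits` bits each, when up to `maxhamming` bit errors are corrected.
 * ASSUMPTION: mirrors the capacity computation in quick_decode_init(). */
uint64_t qd_estimate_entries(uint64_t ncodes, uint64_t nbits, int maxhamming)
{
    uint64_t capacity = ncodes;                 /* exact matches */

    if (maxhamming >= 1)
        capacity += ncodes * nbits;             /* one flipped bit */
    if (maxhamming >= 2)
        capacity += ncodes * nbits * (nbits - 1);               /* two */
    if (maxhamming >= 3)
        capacity += ncodes * nbits * (nbits - 1) * (nbits - 2); /* three */

    return capacity * 3;  /* assumed over-allocation to keep the table sparse */
}

/* Estimated table size in bytes under the same assumptions. */
uint64_t qd_estimate_bytes(uint64_t ncodes, uint64_t nbits, int maxhamming)
{
    return qd_estimate_entries(ncodes, nbits, maxhamming)
           * sizeof(struct qd_entry_sketch);
}
```

Under these assumptions, tagStandard52h13 (48714 codes, 52 bits) with the default bits_corrected of 2 needs 395,314,110 entries, which is about 6.3 GB at 16 bytes per entry and in line with the roughly 6 GB reported above. tagCircle49h12 (65535 codes, 49 bits) comes out at about 7.6 GB. With bits_corrected set to 1 the estimate for tagStandard52h13 drops to about 124 MB, and with 0 to about 2.3 MB. If I recall the API correctly, the value is chosen when the family is registered, via apriltag_detector_add_family_bits(td, tf, bits_corrected).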

kde-baskets-user commented 3 months ago

Thanks for your answer. Since tagStandard52h13 has 48714 different entries, one AprilTag code carries log2(48714) ≈ 15.6 bits of information. By "useful data" I meant "information" in the rigorous sense of the word; sorry for any confusion. Everything else in the 52-bit representation has to do with error correction, I assume.
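Written out, using the 48714 codes and 52 bits per code quoted earlier in the thread, the arithmetic behind this figure is:

$$
I = \log_2(48714) \approx 15.57 \ \text{bits}, \qquad 52 - I \approx 36.4 \ \text{bits of redundancy per tag}
$$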

Now, while memory grows only polynomially with ncodes (and not even quadratically, as far as I can see), 6 GB is prohibitively costly for many embedded applications, especially for the seemingly modest task of error correction. But I am not an expert in this field.

I am currently evaluating several 2D barcode libraries and this is so far a show-stopper for apriltag. I am going to run some more tests and perhaps will be back with more data on this.

christian-rauch commented 3 months ago

> Thanks for your answer. Since tagStandard52h13 has 48714 different entries, one AprilTag code carries log2(48714) ≈ 15.6 bits of information. By "useful data" I meant "information" in the rigorous sense of the word; sorry for any confusion. Everything else in the 52-bit representation has to do with error correction, I assume.

Yes, most of the bits serve the Hamming distance, i.e. they allow a certain number of bit errors before codes / IDs get mixed up. If you need many IDs with a large Hamming distance, you simply have to use many bits. If you do not need a guaranteed minimum Hamming distance between codes, you can just use a 16-bit tag family.
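To make the role of the Hamming distance concrete, here is a small sketch in plain C. It is not taken from the apriltag sources, and the function names are hypothetical. It counts the bit positions in which two code words differ and applies the standard coding-theory bound that a code with minimum distance d can unambiguously correct up to (d - 1) / 2 bit errors.

```c
#include <stdint.h>

/* Number of bit positions in which two code words differ. */
int hamming_distance(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;  /* set bits mark differing positions */
    int count = 0;

    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Largest number of bit errors that can still be attributed to a single
 * code word when all code words are at least `min_distance` bits apart. */
int max_correctable_errors(int min_distance)
{
    return (min_distance - 1) / 2;
}
```

For a minimum distance of 13 this bound is 6 bit errors, and for 11 it is 5. The default bits_corrected of 2 mentioned above stays well below either bound.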

> Now, while memory grows only polynomially with ncodes (and not even quadratically, as far as I can see), 6 GB is prohibitively costly for many embedded applications, especially for the seemingly modest task of error correction. But I am not an expert in this field.

> I am currently evaluating several 2D barcode libraries and this is so far a show-stopper for apriltag. I am going to run some more tests and perhaps will be back with more data on this.

I would be curious to see alternative solutions to this.

If I may ask, why are you trying to distinguish that many unique IDs in an embedded application? If memory is a concern, why not use a smaller tag family? For example, tag36h11 has more than 500 unique IDs with a minimum Hamming distance of 11 and consumes far less memory.