Closed borongyuan closed 1 year ago
I created a new repository to test SuperPoint on the OAK camera. I haven't decoded the inference results yet; this is just a rough evaluation of performance. When the StereoDepth node is enabled, 10 of the 16 SHAVE cores are available for NeuralNetwork nodes, so I compiled the model for 10 shaves (single thread) or 5 shaves (2 threads). The inference time for 320x200 input is about 63 ms using a single thread. This means we can run SuperPoint on the OAK camera at 15 FPS at least. We can also use 2 threads to get higher FPS, at the cost of higher latency.
It is very interesting that SuperPoint could run on device. For this function: https://github.com/introlab/rtabmap/blob/af481fc73074ae13faaa08511ee6d5e36e805b65/corelib/include/rtabmap/core/SensorData.h#L273
The arguments are in order of priority:

- `keypoints`: odometry won't re-detect 2D keypoints, so if you already provide GFTT features (or any other OpenCV keypoints), that step will be skipped.
- `keypoints3D` (should be the same size as `keypoints`): the 3D locations of the 2D keypoints in the base frame (i.e., in the ROS coordinate system, not the camera frame). Stereo correspondence computation (in the stereo case) or 3D projection of the 2D keypoints (in the RGB-D case) will be skipped.
- `descriptors` (should have the same number of rows as the `keypoints` size): the descriptors corresponding to the 2D keypoints. Descriptors won't be re-extracted if provided.

Note:
With `Vis/EstimationType=1` (default, PnP), the 3D keypoints vector is not required, but it is recommended in order to get a better covariance estimation.

Hi, I have added SuperPoint heatmap visualization in the example; it looks good. https://github.com/borongyuan/depthai-superpoint
DepthAI has a FeatureTracker node for feature detection and tracking, so I took a look to see if it is possible to add features directly to SensorData. I found that there is indeed a setFeatures() method, which accepts 3 inputs, and it looks like not all 3 are required. I'm trying to understand how this should be used, and what the effect of different input combinations is.
Specifically, FeatureTracker just does Harris or Shi-Tomasi (GFTT) corner detection, so only 2D keypoints can be provided; the last 2 inputs would be empty. Is this going to work? Will the subsequent pipeline calculate descriptors for these detected points? Do I need to modify other parts of the code or parameters?
Also, some people are trying to run SuperPoint directly on OAK camera. It would be very helpful if this worked.