Closed ZitongLan closed 5 months ago
Thanks for trying out the project.
First, if you haven't already, I would try --mode 2d if your map lives entirely on a flat surface, or --mode 1.5d if your tags all lie in parallel planes.
If that doesn't work, I would be interested in looking at your map data if you would be willing to upload it somewhere.
Other remarks
Usually when this happens, it means the optimization has reached a local optimum that is hard to escape, and any subsequently added images are likely to corrupt the map further. I have an undocumented tool in pytagmapper/pytagmapper_tools/interactive_optimizer.py that you could try, to see what's going on. It requires PySDL2, PyOpenGL, and pyimgui.
Run python interactive_optimizer.py [your data directory]. In the GUI that pops up, drag the windows around so you can see both "images" and "control".
Then click [add image] to add the next image manually into the map.
After adding one more image, check the [optimize] box so that the optimizer tries to reduce the global error.
You will notice that in the images panel, every image comes with some buttons.
[update viewpoint] and [update tag] are useful when the [optimize] box is unchecked: optimization will then only affect the estimates of the checked quantities.
[reinit 0] through [reinit 4] reinitialize the estimated camera pose to one of five preset initial guesses, and [reinit tag] reinitializes the estimated pose of a tag.
This is what I have used to debug the map-building process, to see whether a particular part of my map is hard for the optimizer; it lets you fix a stable part of the map and run further optimization only on the unstable parts.
Note: I don't think there is a button for saving the map produced by this program. It is intended for debugging only.
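The "fix the stable part, optimize only the unstable part" idea can be sketched as gradient descent with frozen parameters. This is a toy illustration of the concept, not pytagmapper's actual implementation:

```python
def descend(params, checked, grad_fn, lr=0.1, steps=100):
    """Gradient descent that only updates parameters whose 'checkbox' is ticked."""
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g if c else p      # unchecked params stay frozen
                  for p, g, c in zip(params, grads, checked)]
    return params

# Example: minimize (a-3)^2 + (b+2)^2, but keep b frozen at its initial value
grad = lambda ps: [2 * (ps[0] - 3), 2 * (ps[1] + 2)]
result = descend([0.0, 0.0], [True, False], grad)
```

Here `result[0]` converges toward 3 while `result[1]` stays at 0.0, mirroring how a stable part of the map can be held fixed while the optimizer works on the rest.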
Hi Mark, thanks for your reply. I am currently using your pytagmapper to build a 3D map of tags, and then I will use a camera to do localization against that map. However, I find that build_map.py is not very stable. I understand that it optimizes the tag locations by starting from the frame with the most tags, so I only extract frames containing at least 5 tags. But somehow the map building gets stuck, as you said: the error stops decreasing. What is going wrong?
I think I could send you a sample folder containing the following files.
Note that I put the tags on three walls that are perpendicular to each other, but the tags are not oriented in consistent directions.
By the way, I changed the distortion coefficients in the solvePnPwrapper function in build_map.py to the coefficients obtained from my camera calibration.
Could you provide an email address so that I can send my files to you?
Please send to [redacted]
Hi @markisus How is everything going? Were you able to reconstruct a 3D map of tags without running into the local-minimum errors?
Hi @ZitongLan . I responded to your email, but here is a copy of what I wrote.
Hi Zitong, I started investigating the issue. Just looking at your scene, this should be a very easy map to optimize. I suspect something is going wrong with the calibration. My testing was done on the RealSense D435, which has almost no distortion.
When I used your distortion parameters and camera matrix to undistort the "map.MOV", the resulting images were even more distorted! See the "undistorted" image below, and focus on the whiteboard edge on the right.
Here is the original frame from map.MOV
I attached my edits to your code which I used to undistort, which I basically copied from https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html
Can you check whether the distortion parameters are correct? After that, you can try undistorting all images beforehand and using the newcameramtx with zero distortion within pytagmapper.
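One quick sanity check, sketched below in plain Python: plug the coefficients quoted in the attached script into OpenCV's radial distortion polynomial and see how large the correction factor gets near the image corner (the 1920x1080 frame size is my assumption):

```python
# Sanity check of the radial distortion model, using the calibration
# values from this thread (frame size 1920x1080 is an assumption).
k1, k2, k3 = 2.31443426e-01, -1.54574898e+00, 3.49512369e+00
fx, fy = 1.68540700e+03, 1.68671587e+03
cx, cy = 9.61988173e+02, 5.35394312e+02

def radial_factor(r2):
    # OpenCV's model: x_dist = x * (1 + k1*r^2 + k2*r^4 + k3*r^6) + tangential terms
    return 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3

# normalized coordinates of the bottom-right image corner
xn = (1920 - cx) / fx
yn = (1080 - cy) / fy
r2 = xn * xn + yn * yn
print(f"r^2 at corner = {r2:.3f}, radial factor = {radial_factor(r2):.3f}")
```

A factor far from 1, or one that swings wildly between nearby radii, is a hint that the coefficients do not describe the actual lens; a well-calibrated low-distortion camera stays close to 1 across the whole field.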
(attached code below)
import numpy as np
import cv2
import PIL.Image as Image
import os

###### distortion coeffs and camera matrix taken from basement2 ####
distortion_coeffs = np.array([2.31443426e-01, -1.54574898e+00, 4.34679911e-03, -6.71125680e-04, 3.49512369e+00])
mtx = np.array([
    [1.68540700e+03, 0.00000000e+00, 9.61988173e+02],
    [0.00000000e+00, 1.68671587e+03, 5.35394312e+02],
    [0.00000000e+00, 0.00000000e+00, 1.00000000e+00],
])
######################################################################

video_path = 'basement_new2/map.MOV'
cap = cv2.VideoCapture(video_path)
print(cap)
frames = []
ret = True
save_interval = 50
cnt = 0

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_APRILTAG_36h11)
parameters = cv2.aruco.DetectorParameters()
parameters.maxMarkerPerimeterRate = 0.5         # Increase to detect larger markers
parameters.minDistanceToBorder = 3              # Increase to avoid detections too close to the edge
parameters.minMarkerDistanceRate = 0.05         # Increase to reduce false positives
parameters.maxErroneousBitsInBorderRate = 0.35  # Increase for more error tolerance
parameters.errorCorrectionRate = 0.3            # Increase to correct more errors in the detected markers
detector = cv2.aruco.ArucoDetector(dictionary, parameters)

frame_num = 0
file_id = 0
while ret:
    ret, frame = cap.read()
    frame_num += 1
    # print(frame_num)
    if ret:
        # You can process the frame here if needed
        cnt += 1
        if cnt == save_interval:
            cnt = 0
            # image = Image.fromarray(frame[:,:,[2,1,0]], 'RGB')
            image = np.ascontiguousarray(frame)

            #####################################################################
            # undistort
            # https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html
            h, w = image.shape[:2]
            newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, distortion_coeffs, (w, h), 1, (w, h))
            undistorted = cv2.undistort(image, mtx, distortion_coeffs, None, newcameramtx)
            # crop the image
            x, y, w, h = roi
            undistorted = undistorted[y:y+h, x:x+w]

            # save the undistorted camera matrix (once)
            if not os.path.exists('basement_new2/map_img2/camera_matrix.txt'):
                with open('basement_new2/map_img2/camera_matrix.txt', "w") as f:
                    for row in newcameramtx:
                        matrix_row = " ".join(str(d) for d in row)
                        f.write(matrix_row + "\n")
            ########################################################################

            aruco_corners, aruco_ids, aruco_rejected = detector.detectMarkers(undistorted)  # <- use undistorted image!
            print(aruco_ids)
            cv2.aruco.drawDetectedMarkers(undistorted, aruco_corners, aruco_ids)
            cv2.imshow('detected frame', undistorted)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            if aruco_ids is None or aruco_ids.shape[0] < 4:
                continue

            with open(os.path.join('basement_new2/map_img2/', f"tags_{file_id}.txt"), "w") as f:
                for tag_idx, tag_id in enumerate(aruco_ids):
                    tag_id = tag_id[0]
                    acorners = aruco_corners[tag_idx][0]
                    f.write(f"{tag_id}\n")
                    f.write(f"{acorners[2][0]} {acorners[2][1]}\n")
                    f.write(f"{acorners[3][0]} {acorners[3][1]}\n")
                    f.write(f"{acorners[0][0]} {acorners[0][1]}\n")
                    f.write(f"{acorners[1][0]} {acorners[1][1]}\n")

            # save the corrected image
            image_save = Image.fromarray(undistorted[:, :, ::-1])  # BGR -> RGB
            image_save.save(f'basement_new2/map_img2/image_{file_id}.png')
            # save the uncorrected image
            image_save = Image.fromarray(image[:, :, ::-1])  # BGR -> RGB
            image_save.save(f'basement_new2/map_img2/raw_image_{file_id}.png')
            file_id += 1
    else:
        break

cv2.destroyAllWindows()

# Save frames to .mat file
# mat_data = {'frames': frames}
# mat_data = np.stack(frames).astype(np.uint8)[::10]
# for i in range(mat_data.shape[0]):
#     image = Image.fromarray(mat_data[i,:,:,:], 'RGB')  # 'RGB' for color images, 'L' for grayscale
#     # Save the image
#     image.save(f'bigger_tag/map_img/image_{i}.png')
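For anyone reading along, here is a minimal reader for the tags_N.txt files the script above writes. The format is inferred from the writer: one tag-id line followed by four "x y" corner lines per detected tag (note that the writer reorders the ArUco corners as indices 2, 3, 0, 1 before writing):

```python
import os
import tempfile

def read_tags_file(path):
    """Parse a tags_N.txt file into {tag_id: [(x, y), ...4 corners...]}."""
    tags = {}
    with open(path) as f:
        lines = [ln.split() for ln in f if ln.strip()]
    for i in range(0, len(lines), 5):            # 1 id line + 4 corner lines
        tag_id = int(lines[i][0])
        tags[tag_id] = [(float(x), float(y)) for x, y in lines[i + 1:i + 5]]
    return tags

# quick round-trip demo with a made-up tag id and corners
sample = "7\n0.0 1.0\n2.0 3.0\n4.0 5.0\n6.0 7.0\n"
path = os.path.join(tempfile.gettempdir(), "tags_demo.txt")
with open(path, "w") as f:
    f.write(sample)
print(read_tags_file(path))
```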
@ZitongLan Have not heard back in a while. Have you resolved the issue?
Hi @markisus Thanks for your help! After correcting the camera distortion, I found the optimization process became faster. However, when I need to build a large-scale tag map, I still encounter some problems, so I turned to some other repos for help, such as rtabmap, where I can use an iPad with LiDAR and turn on the marker detection mode to build the tag map. In the end I am using UcoSlam, which fuses ArUco tag detection into ORB-SLAM. Here are the links: https://introlab.github.io/rtabmap/ https://sourceforge.net/projects/ucoslam/
Again, thanks for your help!
Glad to hear that you got everything to work! Yes, I have also seen that pytagmapper has a hard time with larger maps -- the method used is a bit experimental, it's built in pure Python (slow), and it only looks at tag corners and no other image features, so I'm not surprised that other software performs better on larger scenes.
You might also want to check out Colmap https://colmap.github.io/ if you are still comparing different solutions.
Hi, great work on map building! But I have a problem with the build_map.py file. Every time a new image comes in, once the [xx/yy] count grows large, the optimization somehow becomes super slow. And if I press Ctrl+C each time, the error is almost not decreasing, and then the map-building result is bad. Do you have any idea about this?