markisus / pytagmapper

Python Mapping library for Fiducial Tags
MIT License

Building map is slow. #2

Closed ZitongLan closed 5 months ago

ZitongLan commented 6 months ago

Hi, great work on map building! But I have a problem with the build_map.py file. Every time a new image comes in, once the [xx/yy] count grows large, the optimization somehow becomes super slow. And if I keep pressing ctrl+c, the error is barely decreasing each time, and then the map building result is bad. Do you have any idea about this?

markisus commented 6 months ago

Thanks for trying out the project.

First I would try --mode 2d if you haven't already and your map lives entirely on a flat surface, or --mode 1.5d if your tags all live in parallel planes.
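For example, something along the lines of python build_map.py [your data directory] --mode 2d -- adjust the arguments to however you are already invoking build_map.py.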

If that doesn't work, I would be interested in looking at your map data if you would be willing to upload it somewhere.

Other remarks

Usually when this happens, it means the optimization has reached a local optimum that is hard to escape. Any subsequently added images likely corrupt the map further. I have an undocumented tool in pytagmapper/pytagmapper_tools/interactive_optimizer.py that you could try to see what's going on. It requires PySDL2, PyOpenGL, and pyimgui.
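If you don't already have those, something like pip install PySDL2 PyOpenGL imgui should pull them in (pyimgui is published on PyPI as imgui; on some systems you may also need pysdl2-dll for the SDL2 runtime).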

Run python interactive_optimizer.py [your data directory]. In the GUI that pops up, drag around the windows so you can see both "images" and "control".

Then click [add image] to manually add the next image into the map.

After adding one more image, check the [optimize] box so that the optimizer will try to reduce the global error.

You will notice that in the images panel, every image comes with some buttons.

update viewpoint and update tag are useful when the [optimize] box is unchecked: optimization will then only affect the estimates of the checked quantities.

reinit 0, ..., reinit 4 are for reinitializing the estimated camera pose to one of five preset initial guesses. reinit tag is for reinitializing the estimated pose of a tag.

This is what I have used to debug the map building process to see if a particular part of my map is hard for the optimizer, and it allows fixing a stable part of the map and only doing further optimizations on unstable parts.

Note: I don't think there is a button for saving the map produced by this program; it's only for debugging.

ZitongLan commented 5 months ago

Hi Mark, thanks for your reply. I am currently using pytagmapper to build a 3D map of tags, and then, based on this map, I will use a camera to do localization. However, I find that build_map.py is not so stable. I understand that it optimizes the tag locations by finding the frame with the largest number of tags, so I only extract frames with at least 5 tags. But somehow the map building will stall, as you said: the error will not decrease. What is wrong?

I think I could send you a sample folder containing the following files.

  1. Raw MOV files from an iPhone recording.
  2. A script to extract tags from the MOV files.
  3. The extracted tag txt files.
  4. The camera matrix and distortion coefficients.

Note that I put the tags on three walls that are perpendicular to each other, but the tags are not oriented in consistent directions.

ZitongLan commented 5 months ago

By the way, I changed the distortion coefficients in the solvePnPwrapper function in build_map.py to the coefficients obtained from my camera calibration.
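Concretely, the change amounts to passing the calibrated distortion coefficients into cv2.solvePnP instead of zeros. A minimal standalone sketch with placeholder corner and calibration values (the actual solvePnPwrapper in build_map.py may be structured differently):

import numpy as np
import cv2

# placeholder calibration -- substitute your own camera matrix and distortion coefficients
camera_matrix = np.array([[1685.4, 0.0, 962.0],
                          [0.0, 1686.7, 535.4],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.array([0.231, -1.546, 0.0043, -0.00067, 3.495])

# 3D corners of a 10 cm square tag in the tag frame, in the order SOLVEPNP_IPPE_SQUARE expects
side = 0.10
object_points = np.array([[-side/2,  side/2, 0.0],
                          [ side/2,  side/2, 0.0],
                          [ side/2, -side/2, 0.0],
                          [-side/2, -side/2, 0.0]])

# matching detected pixel corners (placeholder detections)
image_points = np.array([[900.0, 480.0],
                         [1020.0, 482.0],
                         [1018.0, 600.0],
                         [898.0, 598.0]])

# the calibrated dist_coeffs are used here rather than an array of zeros
ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs,
                              flags=cv2.SOLVEPNP_IPPE_SQUARE)
print(ok, rvec.ravel(), tvec.ravel())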

ZitongLan commented 5 months ago

Could you provide an email address so that I can send my files to you?

markisus commented 5 months ago

Please send to [redacted]

ZitongLan commented 5 months ago

Hi @markisus, how is everything going? Would it be possible to reconstruct a 3D map of tags without running into the local-minimum problem?

markisus commented 5 months ago

Hi @ZitongLan . I responded to your email, but here is a copy of what I wrote.


Hi Zitong, I started investigating the issue. Just looking at your scene, this should be a very easy map to optimize. I suspect there is something going wrong with the calibration. My testing was done on the Realsense 435 which has almost no distortion.

When I used your distortion parameters and camera matrix to undistort the "map.MOV", the resulting images were even more distorted! See the "undistorted" image below, and focus on the whiteboard edge on the right.

[screenshot: undistorted frame]

Here is the original frame from map.MOV

[screenshot: original frame from map.MOV]

I attached my edits to your code that I used to undistort; I basically copied the undistortion steps from https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html

Can you check if the distortion parameters are correct? After that, you can try undistorting all images beforehand and use the newcameramtx with 0 distortion within pytagmapper.

(attached code below)

import numpy as np
import cv2
import PIL.Image as Image
import os

###### distortion coeffs and camera matrix taken from basement2 ####
distortion_coeffs = np.array([2.31443426e-01, -1.54574898e+00, 4.34679911e-03, -6.71125680e-04, 3.49512369e+00])
mtx = np.array([
    [1.68540700e+03, 0.00000000e+00, 9.61988173e+02],
    [0.00000000e+00, 1.68671587e+03, 5.35394312e+02],
    [0.00000000e+00, 0.00000000e+00, 1.00000000e+00],
])
######################################################################

video_path = 'basement_new2/map.MOV'
cap = cv2.VideoCapture(video_path)

print(cap)
frames = []
ret = True

save_interval = 50
cnt = 0

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_APRILTAG_36h11)
parameters =  cv2.aruco.DetectorParameters()

parameters.maxMarkerPerimeterRate = 0.5   # Increase to detect larger markers
parameters.minDistanceToBorder = 3        # Increase to avoid detections too close to the edge
parameters.minMarkerDistanceRate = 0.05   # Increase to reduce false positives
parameters.maxErroneousBitsInBorderRate = 0.35  # Increase for more error tolerance
parameters.errorCorrectionRate = 0.3      # Increase to correct more errors in the detected markers

detector = cv2.aruco.ArucoDetector(dictionary, parameters)

frame_num = 0
file_id = 0
while ret:
    ret, frame = cap.read()
    frame_num += 1
    # print(frame_num)
    if ret:
        # You can process the frame here if needed
        cnt += 1
        if cnt == save_interval:
            cnt = 0
            # image = Image.fromarray(frame[:,:,[2,1,0]], 'RGB')
            image = np.ascontiguousarray(frame)

            # #####################################################################
            # undistort
            # https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html
            h,  w = image.shape[:2]
            newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, distortion_coeffs, (w,h), 1, (w,h))
            undistorted = cv2.undistort(image, mtx, distortion_coeffs, None, newcameramtx)
            # crop the image to the valid ROI
            x, y, w, h = roi
            undistorted = undistorted[y:y+h, x:x+w]
            # shift the principal point so newcameramtx matches the cropped image
            newcameramtx[0, 2] -= x
            newcameramtx[1, 2] -= y
            # save the undistorted camera matrix
            if not os.path.exists('basement_new2/map_img2/camera_matrix.txt'):
                with open('basement_new2/map_img2/camera_matrix.txt', "w") as f:
                    for row in newcameramtx:
                        matrix_row = " ".join(str(d) for d in row)
                        f.write(matrix_row + "\n")
            ########################################################################

            aruco_corners, aruco_ids, aruco_rejected = detector.detectMarkers(undistorted) # <- use undistorted image!
            print(aruco_ids)

            cv2.aruco.drawDetectedMarkers(undistorted, aruco_corners, aruco_ids)
            cv2.imshow('detected frame', undistorted)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

            if aruco_ids is None or aruco_ids.shape[0] < 4:
                continue

            with open(os.path.join('basement_new2/map_img2/', f"tags_{file_id}.txt"), "w") as f:
                for tag_idx, tag_id in enumerate(aruco_ids):
                    tag_id = tag_id[0]
                    acorners = aruco_corners[tag_idx][0]
                    f.write(f"{tag_id}\n")
                    f.write(f"{acorners[2][0]} {acorners[2][1]}\n")
                    f.write(f"{acorners[3][0]} {acorners[3][1]}\n")
                    f.write(f"{acorners[0][0]} {acorners[0][1]}\n")
                    f.write(f"{acorners[1][0]} {acorners[1][1]}\n")

            # save the corrected image
            image_save = Image.fromarray(undistorted[:, :, ::-1]) # BGR->RGB
            image_save.save(f'basement_new2/map_img2/image_{file_id}.png')

            # save the uncorrected image
            image_save = Image.fromarray(image[:, :, ::-1]) # BGR->RGB
            image_save.save(f'basement_new2/map_img2/raw_image_{file_id}.png')

            file_id += 1
    else:
        break

cv2.destroyAllWindows()

# Save frames to .mat file
# mat_data = {'frames': frames}

# mat_data = np.stack(frames).astype(np.uint8)[::10]

# for i in range(mat_data.shape[0]):
#     image = Image.fromarray(mat_data[i,:,:,:], 'RGB')  # 'RGB' for color images, 'L' for grayscale
#     # Save the image
#     image.save(f'bigger_tag/map_img/image_{i}.png')
markisus commented 5 months ago

@ZitongLan Have not heard back in a while. Have you resolved the issue?

ZitongLan commented 5 months ago

Hi @markisus, thanks for your help! After correcting the camera distortion, I find the optimization process becomes faster. However, when I need to build a larger-scale tag map, I still encounter some problems, so I turned to some other repos for help, such as rtabmap, where I can use an iPad with lidar and turn on the marker detection mode to build the tag map. In the end I am using UcoSlam, which fuses ArUco tag detection into ORB-SLAM. Here are the links: https://introlab.github.io/rtabmap/ https://sourceforge.net/projects/ucoslam/

Again, thanks for your help!

markisus commented 5 months ago

Glad to hear that you got everything to work! Yes, I have also seen that pytagmapper has a hard time with larger maps -- the method used is a bit experimental, it's built in pure Python (slow), and it only looks at tag corners and no other image features, so I'm not surprised that other software can perform better when it comes to larger scenes.

You might also want to check out Colmap https://colmap.github.io/ if you are still comparing different solutions.