
rowwise matching of images #170

Closed · Janphr closed this 2 years ago

Janphr commented 2 years ago

Hey, I want to color lidar data with camera footage. Unfortunately, the camera is mounted on a gimbal, so the transform between the lidar scanner and the camera changes with every capture. But since the lidar is capable to generate ambient/near infrared images, I thought I might be able to at least match the rows between the two sensors, by treating the change between adjacent pixels of each row as a signal. Here's an example of an ambient image from the Lidar: ambient and here is the camera image: camera The horizontal resolution of the ambient image is half of the vertical resolution, so I applied some scaling to the camera image and cropped the lidar image to equalize them a little more. I also applied histogram equalization: ambient_ camera_ I still have to apply perspective transformation to the camera image. The lidar image has a cylindrical projection. So I'm sure you should be able to equalize them even more. Also in the final setup later, the camera image should always lay inside the lidar image.

So I thought I'd just do something like this:

    import numpy as np
    from dtaidistance import dtw

    kernel_size = 100  # box smoothing kernel (a size of ~100 helps, see note below)
    kernel = np.ones(kernel_size) / kernel_size
    # col_diff: how many columns wider the lidar image is than the camera image
    col_diff = ref_img_eq_16.shape[1] - temp_img_eq_16.shape[1]

    def row_signal(row):
        # Change between adjacent pixels, smoothed with the box kernel. Cast to
        # double first: uint16 rows would wrap around on subtraction, and
        # dtw.distance_fast expects double arrays anyway.
        row = row.astype(np.double)
        return np.convolve(np.abs(row - np.roll(row, 1)), kernel, mode='same')

    for ref_row in range(ref_img_eq_16.shape[0]):
        ref_signal = row_signal(ref_img_eq_16[ref_row])
        min_dist = np.finfo(np.double).max
        for temp_row in range(temp_img_eq_16.shape[0]):
            temp_signal = row_signal(temp_img_eq_16[temp_row])

            if col_diff:
                # Slide the camera row across the wider lidar row, keep the best fit.
                for shift in range(col_diff):
                    d = dtw.distance_fast(temp_signal, np.roll(ref_signal, -shift)[:-col_diff],
                                          use_pruning=True,
                                          # window=col_diff,
                                          max_dist=min_dist,
                                          max_step=10)

                    if d < min_dist:
                        min_dist = d
                        min_dist_col = shift
                        min_dist_row = temp_row

The idea is to simply find the pixel coordinates in the lidar image where the camera image's row fits best, but so far the results aren't too great... I also smooth the rows with a kernel of size about 100, which helps a little.

Do you guys have any ideas on how to get this working? (attached: warp)

Thanks!

Ne-oL commented 2 years ago

(Not the developer, but) have you tried LCSS (longest common subsequence)? It seems more appropriate for your use case.
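Something like the textbook dynamic program below would be a starting point (just a plain NumPy sketch to illustrate the idea, not a fast implementation; epsilon is a matching tolerance you'd have to tune):

    import numpy as np

    def lcss(s1, s2, epsilon):
        # Classic LCSS dynamic program for real-valued series: two samples
        # "match" when they differ by at most epsilon.
        n, m = len(s1), len(s2)
        dp = np.zeros((n + 1, m + 1), dtype=int)
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if abs(s1[i - 1] - s2[j - 1]) <= epsilon:
                    dp[i, j] = dp[i - 1, j - 1] + 1
                else:
                    dp[i, j] = max(dp[i - 1, j], dp[i, j - 1])
        # Similarity in [0, 1]; 1 - similarity can serve as a distance.
        return dp[n, m] / min(n, m)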

Janphr commented 2 years ago

Hey, thanks for the tip. Do you have any specific (fast) LCSS implementation in mind?

I think I'll have to invest more time into equalizing the two images, but since the outcome is unclear, I'll have to suspend this for now. I'll probably temporarily just mount a cheap camera on the lidar until I find a solution for this...

wannesm commented 2 years ago

Two additional thoughts:

  1. Searching for a subsequence where the length might be unknown or not matching is easier using the subsequence alignment strategy.
  2. The values seem to have different absolute values (because they express something slightly different). This often causes shrinking points like in your graph. You seem to want to align the relative shape more than the absolute values. You can try normalization or differencing (and smoothing) as in the DBA clustering example in the tutorial: https://dtaidistance.readthedocs.io/en/latest/usage/clustering.html#k-means-dba-clustering . For other examples of shrinking points, see Keogh et al., 2001. A minimal sketch combining both suggestions follows below.
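A sketch of both ideas together, assuming a recent dtaidistance version that includes the subsequence module (camera_row and lidar_row are placeholder 1-D row signals):

    import numpy as np
    from dtaidistance.subsequence.dtw import subsequence_alignment

    def znorm_diff(s):
        # Differencing + z-normalization: align relative shape, not absolute values.
        d = np.diff(np.asarray(s, dtype=np.double))
        return (d - d.mean()) / d.std()

    query = znorm_diff(camera_row)   # shorter signal (one camera row)
    series = znorm_diff(lidar_row)   # longer signal (one lidar row)

    sa = subsequence_alignment(query, series)
    match = sa.best_match()
    startidx, endidx = match.segment  # where the camera row sits inside the lidar row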
Janphr commented 2 years ago

  1. Looks very promising!
  2. I played around with differencing and z-normalization already, and it seemed to improve the result in at least some rows.

I'll definitely try the subsequence matching. Maybe I'll just search for a few strong matches and calculate the homography from them.
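Roughly like this, I imagine (a sketch assuming OpenCV; cam_matches and lidar_matches are placeholder lists of corresponding (x, y) points from the strongest row matches):

    import cv2
    import numpy as np

    # cam_matches[i] in the camera image corresponds to lidar_matches[i] in the
    # lidar image.
    cam_pts = np.array(cam_matches, dtype=np.float32).reshape(-1, 1, 2)
    lidar_pts = np.array(lidar_matches, dtype=np.float32).reshape(-1, 1, 2)

    # RANSAC rejects outlier matches while estimating the homography.
    H, inlier_mask = cv2.findHomography(cam_pts, lidar_pts, cv2.RANSAC, 5.0)

    # Warp the camera image into the lidar image's frame for coloring.
    warped = cv2.warpPerspective(cam_img, H, (lidar_img.shape[1], lidar_img.shape[0]))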

Thanks!