nautilus-a opened this issue 1 month ago
Hello,
Firstly, thank you for sharing this nice work and its implementation.
I have tried to use this code with the official pre-trained weights to run inference on KITTI and Waymo images.
However, the inference results look wrong when the input images are not square.
How can I solve this issue?
The examples below are from KITTI and Waymo (square inputs produce good results).
KITTI results (two 512x512 cropped inputs)
square_kitti.webm
KITTI results (two 1696x512 inputs)
https://github.com/user-attachments/assets/64924618-0eef-4f73-83b0-ad733869b8dc
Waymo results (two 512x512 cropped inputs)
https://github.com/user-attachments/assets/79c2b551-a5b5-4db3-8448-00c28cf85630
Waymo results (two 768x512 inputs)
https://github.com/user-attachments/assets/178f2378-48d8-403f-9c9c-d12622cd7867

Reply from the maintainers:

During our training we only use square, 512x512 images, so I am unsure how well the model will generalize to other resolutions and aspect ratios. The generated point clouds should still be similar to those created by MASt3R, though, since we use a mostly unmodified MASt3R model that was trained with different aspect ratios. If you are using the Gradio demo, there is some code that rescales/crops the images; you might want to check that it is working correctly for your samples. Otherwise, you may need to finetune a version of the model with different aspect ratios.
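As a stopgap, mirroring the square inputs that already work for you, a small preprocessing step like the sketch below (plain PIL, placeholder file names; not the demo's own preprocessing) center-crops each frame to a square and resizes it to 512x512 before it goes into the demo:

```python
# Minimal workaround sketch: center-crop frames to the square 512x512
# resolution the model was trained on before running inference.
# Plain PIL; the file names below are only examples.
from PIL import Image


def center_crop_square(path, size=512):
    img = Image.open(path).convert("RGB")
    w, h = img.size
    side = min(w, h)              # largest square that fits in the frame
    left = (w - side) // 2
    top = (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    return img.resize((size, size), Image.LANCZOS)


# Example: prepare a KITTI image pair (placeholder paths) for the demo.
for p in ["kitti_0000.png", "kitti_0001.png"]:
    center_crop_square(p).save(p.replace(".png", "_512.png"))
```

Note that this discards the sides of wide KITTI/Waymo frames, so it is only a sanity check; for full-frame results, finetuning with the target aspect ratios, as noted above, is the proper fix.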