3dlg-hcvc / paris

[ICCV 2023] Official implementation of the paper "PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects"
https://3dlg-hcvc.github.io/paris/
MIT License

how to make a real-world dataset #12

Closed GuoJunfu-tech closed 2 months ago

GuoJunfu-tech commented 2 months ago

Could you share the methods you have employed to extract real-world data from the Multi-Scan dataset? Furthermore, how have you aligned the world coordinates across different states? Also, I'm curious about the process you use to obtain the ground truth for your data. Thanks!

SevenLJY commented 2 months ago

Hi @GuoJunfu-tech. Thanks for your interest in our work.

I used the scanner app and processing server provided by MultiScan to scan, collect, and annotate the real-world data. For the details, you can refer to their documentation on downloading and deploying their tools.

For object alignment across states in world coordinates, I did it manually for each object instance. Specifically, I aligned the meshes using the ICP alignment tool in MeshLab.
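For intuition, here is a minimal numpy sketch of point-to-point ICP, the same kind of alignment MeshLab's tool performs (this is an illustrative re-implementation, not the code used for the dataset; it assumes the two states are already roughly pre-aligned, which is why ICP converges):

```python
import numpy as np

def best_fit_transform(src, dst):
    """Kabsch: rigid (R, t) minimizing ||R @ src_i + t - dst_i|| over paired points."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

def icp(src, dst, iters=50):
    """Brute-force nearest-neighbour ICP; returns (R, t) mapping src onto dst."""
    cur = src.copy()
    for _ in range(iters):
        # Pair each source point with its nearest target point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = dst[d2.argmin(1)]
        R, t = best_fit_transform(cur, nn)
        cur = cur @ R.T + t
    # Recover the total transform accumulated over all iterations.
    return best_fit_transform(src, cur)
```

In practice the static part of the articulated object drives the alignment, so the two scans should be cropped or weighted so that the moving part does not bias the fit.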

Hope it helps. Thanks!

GuoJunfu-tech commented 2 months ago

Many thanks!

GuoJunfu-tech commented 2 months ago

One more thing: how do you process the images to extract the main object? Do you use a semantic segmentation method like Mask R-CNN, or do you just manually remove the background?

SevenLJY commented 2 months ago

To construct the GT data, I simply projected the 3D mesh onto the image to obtain the object mask for each view. This ensures consistency across views. In my experience, though, mask accuracy does not affect the performance that much.
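A minimal numpy sketch of that projection step (an illustration of the idea, not the actual pipeline code; a real implementation would rasterize the mesh triangles and handle occlusion, whereas this simplified version just splats the vertices through a pinhole camera):

```python
import numpy as np

def project_mask(vertices, K, w2c, hw):
    """Splat mesh vertices into a binary object mask for one view.

    vertices: (N, 3) world-frame points; K: (3, 3) camera intrinsics;
    w2c: (4, 4) world-to-camera extrinsics; hw: (H, W) image size.
    """
    H, W = hw
    v_h = np.hstack([vertices, np.ones((len(vertices), 1))])
    cam = (w2c @ v_h.T).T[:, :3]          # points in the camera frame
    front = cam[:, 2] > 1e-6              # keep only points in front of the camera
    uvz = (K @ cam[front].T).T
    uv = np.round(uvz[:, :2] / uvz[:, 2:3]).astype(int)  # perspective divide
    mask = np.zeros((H, W), dtype=bool)
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    mask[uv[ok, 1], uv[ok, 0]] = True     # note: row = v, column = u
    return mask
```

Because every view's mask comes from the same mesh and camera poses, the masks are consistent across views by construction.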

That said, off-the-shelf 2D segmentation models (e.g., SAM) should also work well, and may even segment the object better than my approach, since the reconstructed mesh is not perfect.

GuoJunfu-tech commented 2 months ago

Thank you so much! It has resolved my confusion.