threedle / GeoCode

GeoCode maps 3D shapes to a human-interpretable parameter space, allowing users to intuitively edit 3D shapes recovered from a point cloud or sketch input.

working with real-world images and complex backgrounds #10

Open abbhinavvenkat opened 1 year ago

abbhinavvenkat commented 1 year ago

Hi,

I'm looking at training and testing GeoCode on real-world images with complex backgrounds.

  1. Do you have any suggestions on how I should go about creating the ground truth for this? Which modalities do I need to train this model?
  2. For testing your existing model (say, for chairs) on real-world images, I presume I have to convert the images to sketches. What do you recommend for this? Do I need anything else apart from sketches to perform this step?

Thanks in advance!

Regards, Abbhinav

ofekp commented 1 year ago

If you want to test the existing model with real images, converting the images to sketches is the best option I can think of right now. To generate the sketches, you could explore classical algorithms, recent papers on the subject, or a combination of the two. If these fail to produce good results, I would also consider using a segmentation model to extract the object first and then converting the extracted object to a sketch.

For training, given a dataset, I would start with an image encoder and see how it performs before exploring anything else. Creating such a dataset could be tricky. To preserve the shape labels as much as possible, I would probably generate synthetic scenes that contain the objects. If a model trained on such a dataset does not perform well on real images, you will have to explore ways to bridge the domain gap.
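As a rough illustration of the "classical algorithm" route for turning an image into a sketch, here is a minimal Sobel edge-detection sketch in NumPy. It is not part of GeoCode: the function name `sobel_sketch` and the `threshold` parameter are hypothetical, the input is assumed to be a 2D grayscale array, and the black-on-white output style is an assumption that may not match the sketches the model was trained on.

```python
import numpy as np


def sobel_sketch(gray, threshold=0.25):
    """Return a black-on-white line drawing (uint8) from a 2D grayscale array.

    Hypothetical helper for illustration only; not a GeoCode function.
    """
    img = np.asarray(gray, dtype=np.float64)
    if img.ndim != 2:
        raise ValueError("expected a 2D grayscale array")

    # Replicate the border so the output keeps the input shape.
    p = np.pad(img, 1, mode="edge")

    # Sobel derivatives computed from shifted views of the padded image.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])

    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak == 0:
        # Flat image: no edges, so return a blank white canvas.
        return np.full(img.shape, 255, dtype=np.uint8)

    # Keep pixels whose gradient is at least `threshold` of the strongest edge.
    edges = magnitude / peak >= threshold
    return np.where(edges, 0, 255).astype(np.uint8)


# Toy input: a bright square on a dark background.
image = np.zeros((16, 16))
image[4:12, 4:12] = 1.0
sketch = sobel_sketch(image)
```

On real photos with cluttered backgrounds this will also trace background edges, which is where the segmentation step mentioned above would come in: the background could be flattened to a constant value before the image is passed to the edge detector.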