wyysf-98 / CraftsMan

CraftsMan: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
https://craftsman3d.github.io/

No camera conditioning during inference? #12

Closed by rfeinman 5 months ago

rfeinman commented 5 months ago

Thanks for the great paper and code!

The paper says that your shape diffusion model conditions on camera embeddings in addition to images. But in the code, it looks like you are only inputting the images (see the snippet below). Am I missing something? Does your model use the cameras or not? Thanks for clarifying!

https://github.com/wyysf-98/CraftsMan/blob/9be2729bc3564f1ecc165171b4205313f7ace7b1/craftsman/systems/shape_diffusion.py#L330

wyysf-98 commented 5 months ago

Hi, during inference, if no camera is provided, we fall back to the default parameters, as in: https://github.com/wyysf-98/CraftsMan/blob/9be2729bc3564f1ecc165171b4205313f7ace7b1/craftsman/models/conditional_encoders/clip_encoder.py#L107-L113. The default camera is defined in https://github.com/wyysf-98/CraftsMan/blob/9be2729bc3564f1ecc165171b4205313f7ace7b1/craftsman/models/conditional_encoders/base.py#L42-L67

This is admittedly a bit complicated, because we wanted to keep the inference code simple. Hope this helps.
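
For readers following along, the fallback described in this reply can be sketched roughly as below. This is only an illustration of the pattern, not the repository's actual code: the names `DEFAULT_CAMERA`, `camera_embedding`, and `encode_condition`, the (elevation, azimuth, distance) layout, and the sin/cos embedding are all assumptions made for the example. See the linked lines in `clip_encoder.py` and `base.py` for the real defaults.

```python
import math
from typing import List, Optional, Sequence

# Hypothetical default camera: (elevation_deg, azimuth_deg, distance).
# The real default lives in conditional_encoders/base.py and may differ.
DEFAULT_CAMERA = (0.0, 0.0, 1.5)


def camera_embedding(camera: Sequence[float]) -> List[float]:
    """Encode (elevation_deg, azimuth_deg, distance) as a small feature vector.

    The sin/cos encoding is an assumption chosen for illustration.
    """
    elev_deg, azim_deg, dist = camera
    elev, azim = math.radians(elev_deg), math.radians(azim_deg)
    return [math.sin(elev), math.cos(elev), math.sin(azim), math.cos(azim), dist]


def encode_condition(
    image_features: Sequence[float],
    camera: Optional[Sequence[float]] = None,
) -> List[float]:
    """Build the conditioning vector from image features plus a camera embedding.

    If the caller passes no camera (as in the simplified inference path),
    the default camera is used, so the model is still camera-conditioned.
    """
    if camera is None:
        camera = DEFAULT_CAMERA
    return list(image_features) + camera_embedding(camera)
```

Under these assumptions, calling `encode_condition(feats)` with images only gives the same result as passing `DEFAULT_CAMERA` explicitly, which is why the inference snippet can omit the cameras while the model still receives a camera embedding.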

rfeinman commented 5 months ago

Ah yes I see this now, that makes sense. Thanks for clarifying!