Implementation of the Normal-based Geometry Refinement, especially the Training-free Cross-view Attention?

wyysf-98 / CraftsMan

CraftsMan: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner

https://craftsman3d.github.io/

Implementation of the Normal-based Geometry Refinement, especially the Training-free Cross-view Attention? #25

Open Tuich opened 2 weeks ago

Tuich commented 2 weeks ago

Thank you for your great work! I am impressed by the fine detail in your results. I am confused about the implementation of the Training-free Cross-view Attention, which the paper does not describe in much detail. How does the ControlNet part use the multi-view features? Will this part be released?

Thank you very much!!!

Learningm commented 1 week ago

I had the same question while reading this part: how is it implemented in a training-free manner? Does it use MVDream's multi-view block to replace the original 2D attention block in ControlNet? That approach would seem to require retraining.

Tuich commented 3 days ago

Do you mean generating detailed normal maps for all the views in a row?

Learningm commented 3 days ago

@Tuich Not quite sure. It seems that replacing the multi-view block alone is not enough, since we need to keep the latent-space embedding from ControlNet the same as the original.
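
One possible reading of "training-free" here (an assumption, not something the CraftsMan authors have confirmed in this thread) is that the pretrained self-attention weights are kept untouched and only the token layout changes, so that each view attends to the tokens of all views. The NumPy sketch below illustrates that idea only. The function names, the single-head layout, and the view-major batch ordering are hypothetical, and this is not the repository's code.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Plain single-head self-attention.

    x: (batch, tokens, dim); w_q, w_k, w_v: (dim, dim) projection weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def cross_view_attention(x, w_q, w_k, w_v, num_views):
    """Hypothetical training-free cross-view attention.

    x: (batch * num_views, tokens, dim), assumed to be stored so that the
    views of one object are contiguous along the first axis.
    The projection weights are reused unchanged; only the reshape differs
    from ordinary per-view self-attention.
    """
    bv, tokens, dim = x.shape
    batch = bv // num_views
    # Merge the tokens of all views of an object into one long sequence.
    merged = x.reshape(batch, num_views * tokens, dim)
    out = self_attention(merged, w_q, w_k, w_v)
    # Split back into per-view feature maps for the rest of the network.
    return out.reshape(bv, tokens, dim)
```

With `num_views=1` this reduces to ordinary self-attention, so the layer's input and output shapes stay what the rest of the network expects. Whether the resulting features stay close enough to the original ControlNet latent space without retraining is exactly the open question raised above.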