TencentARC / InstantMesh

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Apache License 2.0
2.62k stars, 249 forks

Removal of logos/brand names #86

Open aditdesai opened 1 month ago

aditdesai commented 1 month ago

I tried to generate a 3D model of a hoodie, and there were holes in the places where there's a logo. I tried a couple of different examples and got the same result most of the time. Is this intentional? Is there a way to bypass this via a script?

iiiCpu commented 1 month ago

Type "I solemnly swear that I am up to no good" at the beginning of the file name.

Well, jokes aside, I bet the logo differs strongly from the rest of the hoodie, so the model expects it to be a hole, an eye, or some sort of relief. That leaves you two options: either train InstantMesh to ignore logos, or use Stable Diffusion to inpaint cloth over the logos. Of course, with SD you'll need to either mask the inpaint zone yourself (if it's the same for all images) or use ControlNet/unCLIP to automatically locate and mask the logo in every image.
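For the fixed-mask case, a minimal sketch of building the inpaint mask might look like this. It assumes the logo sits in a known rectangle; the function name `make_logo_mask`, the box coordinates, and the padding value are all made up for illustration. The mask follows the common convention where 255 means "repaint" and 0 means "keep", which is what inpainting tools usually expect alongside the input image.

```python
def make_logo_mask(width, height, box, pad=8):
    """Return a height x width grid: 255 inside the padded box, 0 elsewhere.

    box is (left, top, right, bottom) in pixels, right/bottom exclusive.
    """
    left, top, right, bottom = box
    # Grow the box a little so the inpainting also covers the logo's edges,
    # then clamp it to the image bounds.
    left = max(0, left - pad)
    top = max(0, top - pad)
    right = min(width, right + pad)
    bottom = min(height, bottom + pad)
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


# Hypothetical example: a 64x64 image with a logo at (20, 24)-(44, 40).
mask = make_logo_mask(64, 64, box=(20, 24, 44, 40), pad=4)
masked_pixels = sum(v == 255 for row in mask for v in row)
print(masked_pixels)
```

If the logo moves around between images, a fixed box like this won't work and you're back to detecting it per image.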

cavargas10 commented 1 month ago

@iiiCpu Is there any possibility to improve InstantMesh output objects by training them? If so, how would you train?

iiiCpu commented 1 month ago

Note that I'm not on the developer team, nor do I have experience training this exact model.

First, InstantMesh runs the Zero123++ model to generate initial images of the object from different angles. Then InstantMesh uses these images to generate a cloud of points. Finally, it merges this cloud into a final mesh that you can use in any common 3D engine.

So, first, you need to find out which step produces the error.
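One rough way to check the first step, sketched under assumptions: if the generated views already show background where the logo should be, the problem starts in the multi-view step rather than in reconstruction. The helper `background_fraction` below is hypothetical, takes a view as a grid of RGB tuples, and assumes the generated views use a near-white background, which you'd need to verify for your setup. It will also misfire if the logo itself is white.

```python
def background_fraction(view, box, background=(255, 255, 255), tol=10):
    """Fraction of pixels inside box that are within tol of the background color.

    view is a grid (list of rows) of (r, g, b) tuples; box is
    (left, top, right, bottom) with right/bottom exclusive.
    """
    left, top, right, bottom = box
    total = 0
    hits = 0
    for y in range(top, bottom):
        for x in range(left, right):
            pixel = view[y][x]
            total += 1
            # A pixel counts as background if every channel is close to it.
            if all(abs(c - b) <= tol for c, b in zip(pixel, background)):
                hits += 1
    return hits / total if total else 0.0


# Hypothetical 4x4 view: gray hoodie with a 2x2 patch rendered as background.
gray, white = (90, 90, 90), (255, 255, 255)
view = [[gray] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        view[y][x] = white

logo_fraction = background_fraction(view, box=(1, 1, 3, 3))   # logo region only
whole_fraction = background_fraction(view, box=(0, 0, 4, 4))  # entire view
print(logo_fraction, whole_fraction)
```

A high fraction inside the logo box points at the multi-view images; if the views look fine there but the mesh still has a hole, the later steps are the more likely culprit.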

Either way, you'll need a beefy GPU to train the model. Since it's already not quite comfortable on a 10 GB RTX 3080 during generation, I'd expect you to need at least a 24 GB RTX 3090 to have a chance at successful training. The more memory, the better.
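As a back-of-the-envelope check on the memory point: with plain fp32 training and Adam, each parameter needs roughly 16 bytes (4 for the weight, 4 for its gradient, 8 for Adam's two moment estimates), and that's before counting activations. The function name and the parameter count below are placeholders, not InstantMesh's real size.

```python
def training_memory_gib(num_params, bytes_per_param=16):
    """Rough lower bound on training memory in GiB, ignoring activations.

    16 bytes/param assumes fp32 weights (4) + fp32 gradients (4)
    + Adam's two fp32 moment buffers (8).
    """
    return num_params * bytes_per_param / 2**30


# Placeholder parameter count, NOT the real size of InstantMesh.
hypothetical_params = 1_000_000_000
needed = training_memory_gib(hypothetical_params)
print(f"{needed:.1f} GiB before activations")
print("fits in 24 GiB:", needed < 24)
```

Activations, batch size, and image resolution usually add a lot on top of this, so treat the number as a floor rather than a budget.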

Oh, you might also try changing the base model from Zero123++ to Zero123-XL or Stable Zero123. But since they differ slightly from each other, you'll need to adjust the code base. Or you can just wait for @TencentARC to release a newer version with this support built in.