Are there any command line options that can improve the resolution

LIU-Yuxin / SyncMVD

Official PyTorch & Diffusers implementation of "Text-Guided Texturing by Synchronized Multi-View Diffusion"

MIT License

Are there any command line options that can improve the resolution #9

Open jloveric opened 7 months ago

jloveric commented 7 months ago

Some generations seem a bit grainy. Is there a way to increase the resolution from the command line? I want to see if there is anything I can do here before trying to get this to work with SDXL.

LIU-Yuxin commented 7 months ago

You can try a latent view size of 128 (see the config.py file), though it seemed that going to an even higher resolution with SD1.5 might have a negative impact on image quality.

jloveric commented 7 months ago

--latent_tex_size=2048 --latent_view_size=192 seems to help significantly for my use case
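For a sense of scale, here is a rough sketch of what these latent view sizes mean in pixels. It assumes that `latent_view_size` is the side length of each per-view latent and that views are decoded by the standard SD1.5 VAE, which upsamples each side by a factor of 8; SD1.5 itself was trained at 512 px, i.e. a 64x64 latent. The helper name `describe_view` is hypothetical and not part of SyncMVD.

```python
# Assumption: SD 1.5 VAE scales each spatial side by 8 (latent -> pixels).
VAE_SCALE = 8
# SD 1.5 was trained at 512 px, which corresponds to a 64x64 latent.
SD15_NATIVE_LATENT = 64


def describe_view(latent_view_size: int) -> dict:
    """Return the decoded pixel size and the latent area relative to SD 1.5's native size.

    Hypothetical helper for illustration only; not a SyncMVD function.
    """
    pixels = latent_view_size * VAE_SCALE
    area_vs_native = (latent_view_size / SD15_NATIVE_LATENT) ** 2
    return {
        "latent": latent_view_size,
        "pixels": pixels,
        "area_vs_native": area_vs_native,
    }


if __name__ == "__main__":
    for size in (64, 128, 192):
        info = describe_view(size)
        print(
            f"latent {info['latent']:>3} -> {info['pixels']:>4} px per side, "
            f"{info['area_vs_native']:.2f}x the native latent area"
        )
```

Under these assumptions, a latent view size of 192 decodes to 1536 px views with 9 times the latent area SD1.5 was trained on. That is consistent with both the concern about quality at high resolution and the memory pressure reported later in this thread, since per-view activation memory grows at least in proportion to latent area.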

LIU-Yuxin commented 7 months ago

Interesting. Would you mind sharing the use case you are working on and, if possible, comparisons before and after the resolution change? By the way, did you run into any memory issues when going to a larger resolution?

jloveric commented 7 months ago

Buildings. So far I've gotten rid of the prompts for "side", "front", "back" and use 4 cameras above (3 at 3 different angles and one straight above), and then I use 3 on the bottom. Yes, I run out of memory on a 4090, and the above parameters are basically the limit of what a GPU with 24 GB of VRAM can handle. I also use a building fine-tuned model from CivitAI instead of standard SD.
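One possible reading of this camera layout, as a minimal sketch: three cameras on an upper ring, one looking straight down, and three on a lower ring. The elevation values, the even azimuth spacing, and the 60 degree offset of the lower ring are all assumptions chosen for illustration; the comment above does not give the actual angles, and the function names here are hypothetical rather than SyncMVD API.

```python
def ring(elevation_deg: float, count: int, azimuth_offset_deg: float = 0.0) -> list:
    """Return `count` (elevation, azimuth) pairs evenly spaced around the object."""
    step = 360.0 / count
    return [
        (elevation_deg, (azimuth_offset_deg + i * step) % 360.0)
        for i in range(count)
    ]


def building_cameras(upper_elev: float = 30.0, lower_elev: float = -30.0) -> list:
    """Seven hypothetical camera poses: 3 upper, 1 top-down, 3 lower.

    All angle values are illustrative assumptions, not taken from the thread.
    """
    cameras = ring(upper_elev, 3)              # three views from above
    cameras.append((90.0, 0.0))                # one view straight above
    cameras += ring(lower_elev, 3, 60.0)       # three views from below, offset
    return cameras


if __name__ == "__main__":
    for elev, azim in building_cameras():
        print(f"elevation {elev:6.1f}, azimuth {azim:6.1f}")
```

Offsetting the lower ring is a design choice in this sketch so that the lower views cover the gaps between the upper ones; whether that helps for a given building would need to be checked.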

bdcms1 commented 4 months ago

@jloveric I'm also working on building texture generation. Could you say more about how you handled the prompts for "side", "front", and "back"? Also, how exactly was the CivitAI model you mentioned trained and fine-tuned? Any further advice would be welcome, and it would be great if you could share the model files. Thank you very much.

bdcms1 commented 4 months ago

@jloveric @LIU-Yuxin I have another question. When the building model is very simple, for example just a cuboid, the generated textures come out too basic: the facade may be a single plain color, with no structural details such as windows or balconies. Do you have any suggestions on how to improve this? Thank you.

LIU-Yuxin commented 4 months ago

I have added an extra hyperparameter in the latest commit. It helps generate fine details when the SD + ControlNet model is uncertain about the object's appearance, as with the chair sample on the readme page. You may try it out, but I am not certain it will help in your specific case.

jloveric commented 4 months ago

@bdcms1 This is one of my forks of the repo, where I have some of my changes: https://github.com/LIU-Yuxin/SyncMVD/compare/main...jloveric:SyncMVD:main. As you can see, it looks like I used architecturerealmix_v1repair from CivitAI (I may have found a version on Hugging Face...). There is also one other parameter I had to switch; I believe it was "depth" vs "normal" for the ControlNet. You need to use "depth", if I recall.

bdcms1 commented 4 months ago

Thank you very much for your assistance. I truly appreciate it. I also wanted to ask whether SyncMVD supports image prompts. If so, do you have any recommendations or tips for using them effectively?