jloveric opened this issue 7 months ago:

Some generations seem a bit grainy. Is there a way to increase the resolution from the command line? I want to see if there is anything I can do here before trying to get this to work with SDXL.
You can try a latent view size of 128 (check the config.py file), though it seemed that pushing the resolution even higher with SD1.5 might have a negative impact on image quality.
--latent_tex_size=2048 --latent_view_size=192 seems to help significantly for my use case
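For reference, both of those are ordinary command-line options, presumably defined via argparse in config.py. A minimal sketch of how they could be declared and overridden (the default values below are placeholders, not the repo's actual defaults):

```python
# Rough sketch of the two options as they might appear in config.py
# (flag names taken from the thread; the defaults here are placeholders).
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--latent_tex_size", type=int, default=1024,
                    help="Resolution of the shared latent texture map.")
parser.add_argument("--latent_view_size", type=int, default=96,
                    help="Latent resolution rendered for each camera view.")

# Equivalent to passing --latent_tex_size=2048 --latent_view_size=192 on the CLI.
opt = parser.parse_args(["--latent_tex_size=2048", "--latent_view_size=192"])
print(opt.latent_tex_size, opt.latent_view_size)  # 2048 192
```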
Interesting. Would you mind sharing the use case you are working on, and, if possible, before/after comparisons for the resolution change? By the way, did you run into any memory issues when going to a larger resolution?
Buildings. So far I've gotten rid of the prompts for "side", "front", and "back", and I use 4 cameras above (3 at different angles and one straight above) plus 3 on the bottom. Yes, I run out of memory on a 4090; the parameters above are basically the limit of what a 24 GB VRAM GPU can handle. I also use a building fine-tuned model from CivitAI instead of standard SD.
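For a concrete picture, that camera layout amounts to roughly the following sketch (illustrative only; the elevation values and key names are guesses, not the exact settings I use):

```python
# Sketch of the camera layout described above: four cameras above the building
# (three tilted plus one looking straight down) and three from below, with no
# "front"/"side"/"back" suffix added to the prompt for any view.
def building_cameras():
    cams = []
    for i, elev in enumerate([30.0, 45.0, 60.0]):       # three tilted top views
        cams.append({"azimuth": i * 120.0, "elevation": elev})
    cams.append({"azimuth": 0.0, "elevation": 90.0})     # one camera straight above
    for i in range(3):                                   # three views from below
        cams.append({"azimuth": i * 120.0, "elevation": -30.0})
    return cams

if __name__ == "__main__":
    for cam in building_cameras():
        print(cam)
```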
@jloveric I'm also working on building texture generation. Could you give more detail on how you handle the prompts for "side," "front," and "back"? In addition, through what process exactly was the model you mentioned from CivitAI trained and fine-tuned? Any further advice would be welcome, and it would be great if you could share the model files. Thank you very much.
@jloveric @LIU-Yuxin I have another question. When the building model is very simple, even just a plain cuboid, the generated textures come out too basic: the facade may be a single flat color with no structural details such as windows or balconies. I would like to hear some suggestions on how to improve this. Thank you.
I have added an extra hyperparameter in the latest commit. It helps generate some fine details when the SD+ControlNet model is uncertain about the object's appearance, for example the chair sample on the README page. You may try it out, but I'm not very certain it will help in your specific case.
@bdcms1 This is one of my forks of the repo where I have some of my changes: https://github.com/LIU-Yuxin/SyncMVD/compare/main...jloveric:SyncMVD:main. You can see that I used architecturerealmix_v1repair from CivitAI (I may have found a version on Hugging Face...). There is also one other parameter I had to switch: "depth" vs. "normal" for the ControlNet. You need to use "depth", if I recall correctly.
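For reference, in plain diffusers (not SyncMVD's own loading code) the model swap looks roughly like this; the checkpoint id below is a placeholder for wherever your copy of architecturerealmix_v1repair lives:

```python
# Illustration only: pair a depth ControlNet (instead of normal) with a
# building-oriented SD1.5 fine-tune in place of the stock checkpoint.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth",  # depth conditioning, per the comment above
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "path/to/architecturerealmix_v1repair",  # placeholder: local diffusers folder or HF repo id
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
```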
Thank you very much for your assistance; I truly appreciate it. I also wanted to ask whether SyncMVD supports using an image prompt. Additionally, do you have any recommendations or tips for using image prompts effectively?