thygate / stable-diffusion-webui-depthmap-script

High Resolution Depth Maps for Stable Diffusion WebUI
MIT License

support for marigold #385

Closed affromero closed 11 months ago

affromero commented 11 months ago
Discussion: https://github.com/thygate/stable-diffusion-webui-depthmap-script/discussions/379
Issue: https://github.com/thygate/stable-diffusion-webui-depthmap-script/issues/383

I continued the implementation and this is an example:

[Images: Input, Leres No Boost, Marigold No Boost, Leres Boosted, Marigold Boosted]
affromero commented 11 months ago
[Images: Input, Leres Boosted, Marigold No Boost, Marigold Boosted]
semjon00 commented 11 months ago

Hello and thank you so much for this! I will test as soon as I have time. Meanwhile I will ask for some clarifications...

semjon00 commented 11 months ago

Overall looks good. I almost did not believe that I stopped short of only 16 lines with my branch - let alone adding boost with the same 16 lines.

semjon00 commented 11 months ago

My only concern is that the depthmaps are inverted. This script uses the white=near convention. I think the only fix needed is adding Marigold (10) to the list on line 298 of depthmap_generation.py: raw_prediction_invert = self.depth_model_type in [0, 7, 8, 9]. However, after changing it, please test that it actually works nicely, and also that it generates sensible stereoimages, etc.
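The suggested change can be sketched as follows. This is a minimal illustration, not the extension's code: only the list `[0, 7, 8, 9]` and the Marigold index `10` come from the comment above, while `INVERTED_MODEL_TYPES`, `needs_inversion`, and `invert_depth` are hypothetical names, and how the script actually applies the flag is an assumption.

```python
# Model indices whose raw prediction is assumed to need flipping so that
# white = near. [0, 7, 8, 9] is the list quoted in the thread; 10 is Marigold.
INVERTED_MODEL_TYPES = [0, 7, 8, 9, 10]


def needs_inversion(depth_model_type):
    """Mirror of the membership test quoted from depthmap_generation.py."""
    return depth_model_type in INVERTED_MODEL_TYPES


def invert_depth(rows):
    """Flip a 2D depth map (list of rows) so the smallest value becomes the
    largest and vice versa, keeping the original value range."""
    flat = [v for row in rows for v in row]
    lo, hi = min(flat), max(flat)
    return [[hi + lo - v for v in row] for row in rows]


if __name__ == "__main__":
    raw = [[0.0, 0.25], [0.75, 1.0]]  # hypothetical raw prediction, far = bright
    if needs_inversion(10):
        print(invert_depth(raw))
```

A stereo image generated from a map with the wrong convention would push near objects away from the viewer, which is why the comment asks for a stereo check after the change.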

semjon00 commented 11 months ago

Also, would you be so kind to append README with Marigold mentions and an Acknowledgement? Before pushing Marigold support to main we will also need a version bump (misc.py) and a changelog (CHANGELOG.md) with one change.

graemeniedermayer commented 11 months ago

> My only concern is that the depthmaps are inverted. This script uses the white=near convention. I think the only fix needed is adding Marigold (10) to the list on line 298 of depthmap_generation.py: raw_prediction_invert = self.depth_model_type in [0, 7, 8, 9]. However, after changing it, please test that it actually works nicely, and also that it generates sensible stereoimages, etc.

I have tested this. It works without boost.

Maybe Marigold should be git cloned into "extensions/stable-diffusion-webui-depthmap-script/Marigold" rather than "repositories/Marigold" to avoid future conflicts (this is why midas is installed there). Also, I believe diffusers is now a requirement.

semjon00 commented 11 months ago

@graemeniedermayer Indeed, it needs diffusers>=0.20.1, which should be added to install.py. Thankfully, it seems to be a rather lightweight dependency. About conflicts, I'm not sure... maybe?
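As a rough sketch of the kind of check an install script could make for this requirement: the helper names below (`version_tuple`, `needs_install`, `pip_args`) are hypothetical, and the version comparison is deliberately simplified (the `packaging` library handles pre-releases and other edge cases more robustly). Only the `diffusers>=0.20.1` constraint comes from the thread.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Extract the leading numeric release part: '0.20.1' -> (0, 20, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def needs_install(package, min_version):
    """True if the package is missing or older than min_version."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return True
    return version_tuple(installed) < version_tuple(min_version)


def pip_args(package, min_version):
    """Arguments that would be handed to pip; nothing is executed here."""
    return ["install", f"{package}>={min_version}"]


if __name__ == "__main__":
    if needs_install("diffusers", "0.20.1"):
        print("would run: pip", " ".join(pip_args("diffusers", "0.20.1")))
```

Checking before installing keeps the script from touching an environment that already satisfies the constraint, which matters when other extensions pin their own versions.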

graemeniedermayer commented 11 months ago

> @graemeniedermayer Indeed, it needs diffusers>=0.20.1, which should be added to install.py. Thankfully, it seems to be a rather lightweight dependency. About conflicts, I'm not sure... maybe?

Also marigold standalone mode doesn't function with the current imports.

semjon00 commented 11 months ago

Oh, supporting standalone is definitely needed; that means a change to requirements.txt.

affromero commented 11 months ago

Hey folks, I added some of your suggested changes. I am not entirely sure whether there is a protocol for the requirements/install files, so let me know what I should change. In particular, I use this project as standalone, so I am not sure about install.py.

graemeniedermayer commented 11 months ago

I think this should be merged onto the marigold branch. There might be some small changes/extra testing to do before merging into main.

Great work!

semjon00 commented 11 months ago

Merged and did some fixes. Now it would be awesome to support automatic repository pull for standalone. Yes, we can just copy all the files into the root again... But I'd rather not have any new code linked this way.

aulerius commented 11 months ago

This is really good news!

I also want to draw some attention to the equivalent implementation in ComfyUI, and that it implements floating point EXR export:

I added a remap node to see the full range better, and OpenEXR node to save the full range, works wonders compared to default png when used in VFX/3D modeling software.

This relates to #372 and #370. Could it be a nice opportunity to try that as well?

semjon00 commented 11 months ago

Not now, I am reeeaaally busy. Should be doing my homework rn, in fact.

semjon00 commented 11 months ago

@affromero @graemeniedermayer I can't get it to work! Please help... :( I get things like:

WARNING:xformers:WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.1.1+cu121 with CUDA 1201 (you have 2.1.1+cpu)
    Python 3.10.11 (you have 3.10.6)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details

For whatever reason, trying to have XFORMERS automatically installs the wrong torch version.

GPU: 1080Ti OS: Windows 10
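The warning above compares the build xFormers was compiled against (`2.1.1+cu121`) with the installed torch (`2.1.1+cpu`). A small sketch of that comparison follows; the helper names are hypothetical and this is not how xFormers itself performs the check. On a real install, the second string would come from `torch.__version__`.

```python
def split_build(version):
    """Split '2.1.1+cu121' into its release and local build tag."""
    release, _, local = version.partition("+")
    return release, local


def is_cpu_only(version):
    """True when the local build tag marks a CPU-only wheel."""
    return split_build(version)[1] == "cpu"


def build_mismatch(built_for, installed):
    """True when either the release or the build tag differs."""
    return split_build(built_for) != split_build(installed)


if __name__ == "__main__":
    built_for = "2.1.1+cu121"  # from the warning text
    installed = "2.1.1+cpu"    # from the warning text
    print(build_mismatch(built_for, installed), is_cpu_only(installed))
```

Here the release numbers match and only the build tag differs, which suggests a CPU-only torch wheel replaced the CUDA one during dependency resolution.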

graemeniedermayer commented 11 months ago

Oh, I just realized I tested it with a different install.py file. Maybe removing all the new requirements besides diffusers would be best to avoid conflicts. The others are very likely to cause conflicts with other repos, and I think they are already required by a1111.

graemeniedermayer commented 11 months ago

> I also want to draw some attention to the equivalent implementation in ComfyUI, and that it implements floating point EXR export:

It shouldn't be too challenging to save the numpy arrays. Is there a library for converting to EXR?

aulerius commented 11 months ago

> > I also want to draw some attention to the equivalent implementation in ComfyUI, and that it implements floating point EXR export:
>
> It shouldn't be too challenging to save the numpy arrays. Is there a library for converting to EXR?

Apparently so, and it is simply called "OpenEXR". This is how it's done in the aforementioned ComfyUI implementation. TIF files also support 32-bit floating-point precision.
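A hedged sketch of what a float32 export could look like with the legacy `OpenEXR`/`Imath` Python bindings. This is not the ComfyUI node's code, and the function names and the single `Z` channel are assumptions. The optional dependency is imported lazily, and the packing helper uses only the standard library; with numpy, `arr.astype(np.float32).tobytes()` would produce the same bytes.

```python
import struct


def depth_rows_to_float32_bytes(rows):
    """Pack a 2D list of depth values into row-major, native-order float32."""
    flat = [float(v) for row in rows for v in row]
    return struct.pack(f"={len(flat)}f", *flat)


def save_depth_exr(path, rows):
    """Write a single-channel float32 EXR (assumes OpenEXR and Imath are installed)."""
    import Imath    # optional dependency, imported only when exporting
    import OpenEXR

    height, width = len(rows), len(rows[0])
    header = OpenEXR.Header(width, height)
    float_type = Imath.PixelType(Imath.PixelType.FLOAT)
    header["channels"] = {"Z": Imath.Channel(float_type)}  # "Z" is an assumed channel name
    out = OpenEXR.OutputFile(path, header)
    try:
        out.writePixels({"Z": depth_rows_to_float32_bytes(rows)})
    finally:
        out.close()


if __name__ == "__main__":
    depth = [[0.0, 1.5], [2.25, 1000.0]]  # values outside 0..1 survive in float32
    print(len(depth_rows_to_float32_bytes(depth)))
```

Unlike an 8-bit PNG, which quantizes each pixel to 256 levels, a float32 channel keeps the unnormalized depth range, which is the property the VFX/3D use case above depends on.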