FrozenBurning / Text2Light

[SIGGRAPH Asia 2022] Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
https://frozenburning.github.io/projects/text2light/

The generated image content and text do not match #19

Closed ZZfive closed 6 months ago

ZZfive commented 1 year ago

Hey @FrozenBurning

Thanks for your awesome contribution! I am very interested in this work and have run some tests, but I've hit a few issues.


Using the provided weight files, there is a significant deviation between the generated image content and the text. For example:

example 1

command: python text2light.py -rg logs/global_sampler_clip -rl logs/local_sampler --outdir ./generated_panorama --text "purple petal flower" --clip clip_emb.npy --sritmo ./logs/sritmo.pth --sr_factor 4

The images generated by two different inference runs with the above command differ significantly from the text ("purple petal flower"): image

image

example 2

command: python text2light.py -rg logs/global_sampler_clip -rl logs/local_sampler_outdoor --outdir ./generated_panorama --text "Elephant, Watering Hole, Baby Elephant" --clip clip_emb.npy --sritmo ./logs/sritmo.pth --sr_factor 4

image


Apart from the above, even with the model in eval mode and torch.no_grad() during inference, a previous inference still seems to affect the next one: the image generated by the current inference resembles the previous text prompt and deviates significantly from the current input text.
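(In case the carryover comes from shared global RNG state, which is just my guess: below is a minimal sketch that fully reseeds PyTorch, NumPy, and Python's random module before each inference, so consecutive prompts are sampled independently. The reset_seed helper is hypothetical, not part of text2light.py.)

import random

import numpy as np
import torch

def reset_seed(seed: int) -> None:
    # Reseed every RNG so one inference run cannot influence the next.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

# e.g. before sampling each prompt:
for i, prompt in enumerate(["purple petal flower", "sunset over the ocean"]):
    reset_seed(1234 + i)  # fresh, independent seed per prompt
    # ... run the global/local samplers on `prompt` here ...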

Also, regardless of which local sampler is used, the generated hrldr image often has darker, colder tones, while the hdr is overexposed. For example:

example1

hrldr: image

hdr: image

example2

hrldr: image

hdr: image

Which parameters can be adjusted to improve the generation quality of the hrldr and hdr images?


Also, please advise how to handle the problems above.

Thanks!!!

FrozenBurning commented 1 year ago

Thanks for your interest in our work! For the first two examples, you may try resampling or using the dedicated checkpoints for outdoor and indoor scenes respectively. For the last two examples, this is exactly the expected behavior. An HDR map is not suitable for direct visualization as an LDR image, since it is stored in the linear radiance range; if you view it that way, you will get "overexposure" because radiance above 255 is clipped. Please use tools like Blender or OpenHDR, where you can change the camera exposure.
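For a quick preview without Blender, a simple exposure-plus-gamma tone map also works. Below is a generic sketch (the file path and the exact tone curve are assumptions, not something Text2Light ships):

import cv2
import numpy as np

# Load the linear-radiance panorama; OpenCV decodes Radiance .hdr files
# with these flags and returns float32 data (path is hypothetical).
hdr = cv2.imread("generated_panorama/hdr/example.hdr",
                 cv2.IMREAD_ANYDEPTH | cv2.IMREAD_ANYCOLOR)

def tonemap(img, exposure=1.0, gamma=2.2):
    # Exponential exposure curve compresses unbounded radiance into [0, 1];
    # gamma encoding then makes it presentable as an 8-bit LDR preview.
    mapped = 1.0 - np.exp(-img * exposure)
    return (np.clip(mapped, 0.0, 1.0) ** (1.0 / gamma) * 255.0).astype(np.uint8)

cv2.imwrite("hdr_preview_bright.png", tonemap(hdr, exposure=2.0))
cv2.imwrite("hdr_preview_dark.png", tonemap(hdr, exposure=0.5))

Raising the exposure brightens dark regions; lowering it recovers detail around bright sources like the sun, which a direct clipped view cannot show.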

caojiehui commented 10 months ago

> Thanks for your interest in our work! For the first two examples, you may try resampling or using the dedicated checkpoints for outdoor and indoor scenes respectively. For the last two examples, this is exactly the expected behavior. An HDR map is not suitable for direct visualization as an LDR image, since it is stored in the linear radiance range; if you view it that way, you will get "overexposure" because radiance above 255 is clipped. Please use tools like Blender or OpenHDR, where you can change the camera exposure.

As per your advice, I switched to the outdoor model, but the generated results still differ significantly from the textual description.

python text2light.py \
    -rg /data1/chc/T2L/model/Text2Light/global_sampler_clip \
    -rl /data1/chc/T2L/model/Text2Light/local_sampler_outdoor \
    --outdir ./generated_panorama \
    --text "purple petal flower" \
    --clip clip_emb.npy \
    --sritmo /data1/chc/T2L/model/Text2Light/sritmo.pth \
    --sr_factor 4

holistic_ purple petal flower

FrozenBurning commented 6 months ago

This text prompt is an out-of-distribution (OOD) description rather than a scene-level context, which we've discussed in the paper as a limitation and future work.

Closing due to inactivity. Feel free to reopen for further questions!