Open CyrusVorwald opened 10 months ago
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @CyrusVorwald, thanks for opening this issue! `get_depth_map` isn't defined in the transformers library, so it's not something we can work on here. I'd suggest opening a discussion on the model page and sharing these results.
@NielsRogge Could you look into some weights being randomly initialized when loading from this checkpoint?
I think the warning about some weights not being initialized appeared after @younesbelkada added support for DPT-hybrid in the `modeling_dpt.py` code. The hybrid version of DPT introduced additional parameters, which aren't used by the default DPT model.
@NielsRogge I'm not sure that's correct. The warning says that model parameters are being randomly initialized, i.e. the model defines those parameters but they're not present in the state dict being loaded. Moreover, the weights are for `neck.fusion_stage.layers.0`, which according to git blame were layers added as part of the original model.
Gentle ping @NielsRogge
Thanks for the ping. This has to do with the following: https://github.com/open-mmlab/mmsegmentation/blob/b040e147adfa027bbc071b624bedf0ae84dfc922/mmseg/models/decode_heads/dpt_head.py#L271

TL;DR: this is fine, all weights are used; however, the implementation could be improved to avoid the warning about some weights being randomly initialized. Marking this as a good second issue for now.
System Info
Python 3.10.12, transformers 4.36.2
Who can help?
@stevhliu @NielsRogge
Information
Tasks
Reproduction
Expected behavior
Anecdotally, the local scaling methodology used by `get_depth_map` at https://huggingface.co/diffusers/controlnet-depth-sdxl-1.0 seems to work better for models that are stronger at identifying close-range depth, while the global scaling methodology seems to work better for models that are stronger at identifying far-range depth. I combined them below:
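A hedged sketch of what such a combination could look like (the per-image min-max normalization mirrors the local scaling on the model page; the fixed `GLOBAL_MIN`/`GLOBAL_MAX` range and the `alpha` blend are my own illustrative assumptions, not the author's actual code):

```python
import numpy as np

# Assumed fixed range for "global" scaling; real values would come from
# the depth model's output statistics (illustrative assumption).
GLOBAL_MIN, GLOBAL_MAX = 0.0, 10.0

def scale_depth(depth: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend per-image (local) and fixed-range (global) min-max scaling.

    alpha=1.0 reproduces purely local scaling; alpha=0.0 uses only the
    fixed global range.
    """
    local = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
    glob = np.clip((depth - GLOBAL_MIN) / (GLOBAL_MAX - GLOBAL_MIN), 0.0, 1.0)
    return alpha * local + (1.0 - alpha) * glob
```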
```python
display(bad_close_result)
display(good_close_result)
display(good_far_result)
display(bad_far_result)
```
Sufficiently blurring the image prior to detecting depth also gets rid of this, i.e.:
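A minimal sketch of that pre-blurring step, using a plain numpy box blur for illustration (in practice one would likely reach for `PIL.ImageFilter.GaussianBlur` or `cv2.GaussianBlur`; the radius here is an arbitrary assumption):

```python
import numpy as np

def box_blur(image: np.ndarray, radius: int = 3) -> np.ndarray:
    """Blur a 2D image with a separable box filter before depth detection."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    # Pad with edge values so the output keeps the input shape.
    padded = np.pad(image, radius, mode="edge")
    # Convolve along rows, then columns (separable box filter).
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, blurred)
    return blurred
```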