https://github.com/showlab/Image2Paragraph

flrngel / understanding-ai

https://github.com/showlab/Image2Paragraph #29

Open flrngel opened 1 year ago

flrngel commented 1 year ago

Summary

uses BLIP/BLIP-2 to generate a simple caption
uses GRiT/Detectron2 to generate dense captions
uses Segment Anything to generate region_semantic information
merges all of the above into a single prompt for GPT
runs Canny edge detection on the input image (which is the bullshit part) and generates the new image from the edge map using StableDiffusionControlNetPipeline
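
To make the "merge everything into a prompt" step concrete, here is a minimal sketch of how the three sources could be combined into one text prompt. The function name `build_paragraph_prompt`, the input shapes, and the instruction wording are all hypothetical; this is not the project's actual template, only an illustration of the idea.

```python
def build_paragraph_prompt(caption, dense_captions, region_semantics):
    """Combine the three description sources into one prompt string.

    caption          -- one-sentence caption (e.g. from BLIP-2)
    dense_captions   -- list of (text, box) pairs (e.g. from GRiT)
    region_semantics -- list of region labels (e.g. derived from Segment Anything)
    All names and the prompt wording are assumptions made for illustration.
    """
    lines = [
        "Describe the image in one paragraph using only the facts below.",
        f"Caption: {caption}",
        "Dense captions:",
    ]
    # One line per detected object, keeping its bounding box for spatial context.
    lines += [f"- {text} at {box}" for text, box in dense_captions]
    lines.append("Region semantics:")
    lines += [f"- {label}" for label in region_semantics]
    return "\n".join(lines)


example_prompt = build_paragraph_prompt(
    "a dog on a beach",
    [("brown dog", (10, 20, 110, 220)), ("blue ball", (150, 180, 190, 220))],
    ["sand", "sea"],
)
```

The resulting string would then be sent to GPT as the user message; nothing about the image's geometry survives beyond the box coordinates, which matters for the conclusion below.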

Conclusion

The prompt this project outputs is not enough to generate an image similar to the input; the result only resembles the input because the Canny edge map of the input is also fed to the generator.
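
A rough sketch of the regeneration step shows why: the edge map, not the text, carries the layout. This assumes `opencv-python`, `numpy`, `Pillow`, `torch`, and `diffusers` are installed; the function name, the Canny thresholds, and the two model ids passed in by the caller are assumptions, not values taken from the project.

```python
def regenerate_from_edges(image_path, prompt, base_model_id, controlnet_id,
                          low=100, high=200):
    """Generate an image from a text prompt conditioned on Canny edges.

    base_model_id and controlnet_id are left to the caller (hypothetical
    parameters); the thresholds are arbitrary defaults, not the project's.
    """
    # Heavy dependencies are imported lazily so the module loads without them.
    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)

    # The edge map keeps the outlines of the input image.
    edges = cv2.Canny(gray, low, high)
    edge_map = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(controlnet_id)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet
    )
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    # Both the prompt and the edge map condition the generation.
    return pipe(prompt, image=edge_map).images[0]
```

Running the same prompt through a plain text-to-image pipeline, without the edge map, would be the fair test of whether the generated paragraph alone describes the image well enough.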