This project is based on the Segment Anything Model (SAM) by Meta. The UI is built with Gradio.
The following steps run the demo on your own computer. First, install Segment Anything:
pip install git+https://github.com/facebookresearch/segment-anything.git
Clone this repository:
git clone https://github.com/5663015/segment_anything_webui.git
Make a new folder named checkpoints under this project, and put the downloaded weight files in checkpoints. Weights are available for the following models:
vit_h: ViT-H SAM model
vit_l: ViT-L SAM model
vit_b: ViT-B SAM model
Under checkpoints, make a new folder named models--google--owlvit-base-patch32, and put the downloaded OWL-ViT weight files in models--google--owlvit-base-patch32.
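The folder layout described above can also be created with a short script. This is a minimal sketch, not part of the repository: the helper names `prepare_checkpoint_dirs` and `list_weight_files` are hypothetical, and the script only creates the folders and lists what is in them. The weight files themselves still have to be downloaded and copied in by hand.

```python
from pathlib import Path

# Folder name taken from the instructions above.
OWLVIT_DIR_NAME = "models--google--owlvit-base-patch32"


def prepare_checkpoint_dirs(project_root):
    """Create checkpoints/ and the OWL-ViT subfolder under project_root.

    Hypothetical helper for illustration. Returns (checkpoints, owlvit) paths.
    """
    checkpoints = Path(project_root) / "checkpoints"
    owlvit = checkpoints / OWLVIT_DIR_NAME
    # parents=True also creates checkpoints/; exist_ok=True makes reruns safe.
    owlvit.mkdir(parents=True, exist_ok=True)
    return checkpoints, owlvit


def list_weight_files(checkpoints):
    """Return the sorted names of files placed directly in checkpoints/."""
    return sorted(p.name for p in Path(checkpoints).iterdir() if p.is_file())
```

Calling `prepare_checkpoint_dirs(".")` from the project root creates both folders, and `list_weight_files("checkpoints")` then shows which SAM weight files have been copied in so far.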
Run:
python app.py
Note: The default model is vit_b, so the demo can run on CPU. The default device is cpu.
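The defaults in the note can be pictured as a small settings check. This is only an illustrative sketch under stated assumptions: `resolve_model_type` and `resolve_device` are hypothetical helpers, not functions from app.py, and the fallback to CPU when CUDA is unavailable is an assumed policy rather than documented behavior of this project.

```python
# Model types listed in this README.
SAM_MODEL_TYPES = ("vit_h", "vit_l", "vit_b")


def resolve_model_type(requested="vit_b"):
    """Return the requested SAM model type, defaulting to vit_b."""
    if requested not in SAM_MODEL_TYPES:
        raise ValueError(f"unknown model type: {requested!r}")
    return requested


def resolve_device(requested="cpu"):
    """Return a usable device string, falling back to 'cpu' (assumed policy)."""
    if requested == "cpu":
        return "cpu"
    try:
        import torch  # only needed when a non-CPU device is requested
    except ImportError:
        return "cpu"
    if requested.startswith("cuda") and torch.cuda.is_available():
        return requested
    return "cpu"
```

With no arguments, both helpers return the defaults stated in the note: `vit_b` and `cpu`.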
[x] Video segmentation
[x] Add text prompt
[x] Add points prompt
[ ] Add boxes prompt
[ ] Try to combine with ControlNet and Stable Diffusion (SD). Use SAM to generate a dataset for fine-tuning ControlNet, and generate new images with SD.