facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0

Segment stable diffusion output #227

Open darshats opened 1 year ago

darshats commented 1 year ago

Hi, regular segmentation algorithms don't perform well on Stable Diffusion-generated images. Many of the fantastic images it generates are not in the training set of any segmentation model, so keeping segmentation up with it will always be one step behind.

Is there a way SAM can be made part of Stable Diffusion itself? SD has some idea of where each object is going to show up, so I'm curious whether there is a way to make it also emit the segments as part of the denoising steps.

This isn't an issue with SAM as such, but I'm hoping there is a way these models can keep up with generative images, since no amount of training data will suffice.
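To make the idea concrete, here is a minimal sketch (not code from this repo) of how segments could be read off a diffusion model's internal state: each text token's cross-attention map over the latent grid is treated as a soft mask, and each pixel is assigned to the token that attends to it most strongly. The function name and the synthetic NumPy attention maps below are assumptions for illustration; a real pipeline would hook the U-Net's cross-attention layers instead.

```python
import numpy as np

def masks_from_attention(attn_maps: np.ndarray) -> np.ndarray:
    """Assign each latent pixel to the token whose attention on it is strongest.

    attn_maps: (num_tokens, H, W) non-negative cross-attention scores.
    Returns an (H, W) integer label map with one label per token.
    """
    # Normalize each token's map so tokens with large overall attention
    # mass don't dominate the per-pixel comparison.
    norm = attn_maps / (attn_maps.sum(axis=(1, 2), keepdims=True) + 1e-8)
    return norm.argmax(axis=0)

# Toy example: two "tokens", each attending to one half of a 4x4 latent grid.
attn = np.zeros((2, 4, 4))
attn[0, :, :2] = 1.0  # token 0 attends to the left half
attn[1, :, 2:] = 1.0  # token 1 attends to the right half
labels = masks_from_attention(attn)  # left pixels -> 0, right pixels -> 1
```

This is only the aggregation step; in practice the maps would be averaged over denoising timesteps and upsampled to image resolution.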

HIMANSHUSINGHYANIA commented 1 year ago

```bash
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates.

{
  black --version | grep -E "23\." > /dev/null
} || {
  echo "Linter requires 'black==23.*' !"
  exit 1
}

ISORT_VERSION=$(isort --version-number)
if [[ "$ISORT_VERSION" != 5.12* ]]; then
  echo "Linter requires isort==5.12.0 !"
  exit 1
fi

echo "Running isort ..."
isort . --atomic

echo "Running black ..."
black -l 100 .

echo "Running flake8 ..."
if [ -x "$(command -v flake8)" ]; then
  flake8 .
else
  python3 -m flake8 .
fi

echo "Running mypy..."
mypy --exclude 'setup.py|notebooks' .
```

chaoer commented 1 year ago

maybe you are looking for: https://github.com/sail-sg/EditAnything

darshats commented 1 year ago

Not really. EditAnything is still about segmenting regular images and then updating the identified segments with diffusion. What I'm asking about is segmenting the diffusion output itself. Regular algorithms seem to do poorly on many SD outputs where futuristic, fantastical, and unusual images are generated. Can we natively segment diffusion output during the denoising steps? Is there any work happening in that direction?