sshh12 / terrain-diffusion


how to use #2

Closed hosnasn1987 closed 9 months ago

hosnasn1987 commented 9 months ago

Hi,

Could you please guide me on how to use your script to fine-tune Stable Diffusion inpainting with my own dataset?

Thank you.

sshh12 commented 9 months ago

Hey!

You'll first want to create a dataset. See https://github.com/sshh12/terrain-diffusion/blob/main/scripts/build_text2rgb_dataset.py for an example that writes the standard Hugging Face dataset format.

Then follow the instructions in https://github.com/sshh12/terrain-diffusion/blob/main/scripts/train_text_to_image_lora_sd2_inpaint.py to actually train it.

Hope this helps!

hosnasn1987 commented 9 months ago

Hi,

Sorry, could you guide me a bit more?

I can't understand this script.

How can I run it in a Colab notebook?

sshh12 commented 9 months ago

Sure, what's the dataset you are trying to use?

hosnasn1987 commented 9 months ago

Thank you very much.

I want to teach a Stable Diffusion inpainting model what a specific chair looks like.

I have 20 pictures of this chair from all sides.

I want to replace the chairs in a picture with my own chair.

I use GroundingDINO for chair detection and the SAM model to mask the picture.

Now I want to replace the masked chairs with my own chair.

For that, I need the inpainting model to learn my chair.

I would be really thankful for your help.

hosnasn1987 commented 9 months ago

This is my dataset.

sshh12 commented 9 months ago

Ah ok, interesting. Yeah, I think this should work, although I'll suggest one possibly easier alternative first:

  1. Take the input image, mask the original chair, and delete it from the image.
  2. Find a picture of your chair taken from a similar angle (assuming you can identify the angle of the original and have enough pictures to cover every angle) and paste it in, masked. Delete a small buffer between the pasted chair and the original image, then use a pretrained inpainting model to fill in the gaps.
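
The two steps above can be sketched with Pillow alone. This is only an illustration under stated assumptions, not code from the repository: `paste_with_gap` and `buffer_px` are hypothetical names, the masks are assumed to be binary "L" images (255 inside the object), and the new chair image is assumed to be already resized and aligned to the scene. The returned mask marks the pixels a pretrained inpainting model would then be asked to fill.

```python
from PIL import Image, ImageChops, ImageFilter


def paste_with_gap(scene, old_mask, chair, chair_mask, buffer_px=8, fill=(0, 0, 0)):
    """Hypothetical helper: swap a chair and mark the gap left for inpainting.

    scene      -- RGB image containing the original chair
    old_mask   -- "L" mask, 255 where the original chair is
    chair      -- RGB image of the new chair, same size as scene (assumed aligned)
    chair_mask -- "L" mask, 255 where the new chair is
    """
    size = 2 * buffer_px + 1  # MaxFilter needs an odd kernel size
    # Grow both masks so a buffer ring around each object is included.
    old_grown = old_mask.filter(ImageFilter.MaxFilter(size))
    chair_grown = chair_mask.filter(ImageFilter.MaxFilter(size))
    union = ImageChops.lighter(old_grown, chair_grown)

    # Step 1: delete the original chair (and the buffer) from the scene.
    blank = Image.new("RGB", scene.size, fill)
    composite = Image.composite(blank, scene, union)

    # Step 2: paste the new chair through its mask.
    composite.paste(chair, (0, 0), chair_mask)

    # Everything that was cleared but not covered by the new chair must be inpainted.
    inpaint_mask = ImageChops.subtract(union, chair_mask)
    return composite, inpaint_mask
```

The `composite` and `inpaint_mask` pair is the kind of image-plus-mask input that inpainting pipelines typically expect; choosing `buffer_px` is a judgment call between hiding paste seams and giving the model too much area to reinvent.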

As for training a new inpainting model, this should work for that use case as well.

I would start by creating the right dataset format. You should be able to adapt this script to write out the images plus metadata to a folder: https://github.com/sshh12/terrain-diffusion/blob/main/scripts/build_text2rgb_dataset.py

The core of the script, run once per image, is just:

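import json
import os
from PIL import Image

# Per-image loop body; data, id_, train_dir, and metacsv come from the surrounding script.
img = Image.open(data["rgb_fn"])
save_fn = f"{id_:06d}.png"
img.save(os.path.join(train_dir, save_fn))

# Write one JSON record per line, pairing the saved file name with its caption.
meta = dict(file_name=save_fn, text=data["caption"])
metacsv.write(f"{json.dumps(meta)}\n")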

If you don't have captions for the chairs, just use "a picture of a chair" or something similar for every image.
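
For a plain folder of chair photos with one shared caption, an adapted version might look like the sketch below. It is not the repository's script: `build_dataset` is a hypothetical name, files are copied as-is instead of being re-saved as PNG, and the metadata file name `metadata.jsonl` is an assumption based on the Hugging Face image-folder convention, so check it against what the training script actually reads.

```python
import json
import shutil
from pathlib import Path


def build_dataset(src_dir, train_dir, caption="a picture of a chair"):
    """Copy images into train_dir and write one JSON metadata line per image."""
    src, out = Path(src_dir), Path(train_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Sort so that the numbering is reproducible between runs.
    images = sorted(
        p for p in src.iterdir() if p.suffix.lower() in {".png", ".jpg", ".jpeg"}
    )
    # Assumed metadata file name; adjust if the training script expects another.
    with open(out / "metadata.jsonl", "w") as meta_file:
        for id_, path in enumerate(images):
            save_fn = f"{id_:06d}{path.suffix.lower()}"
            shutil.copy(path, out / save_fn)
            meta = dict(file_name=save_fn, text=caption)
            meta_file.write(json.dumps(meta) + "\n")
    return len(images)
```

Calling `build_dataset("chair_photos", "dataset/train")` (both paths are placeholders) would number the images `000000`, `000001`, and so on, each paired with the same caption.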

Once the dataset is formatted, training should be as easy as running the command in this file, https://github.com/sshh12/terrain-diffusion/blob/main/scripts/train_text_to_image_lora_sd2_inpaint.py, but with the path to your dataset.
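
Before starting a training run, it may be worth checking that the folder is consistent. The following is a small sanity-check sketch, not part of the repository: `check_dataset` is a hypothetical helper, and it assumes the metadata lives in a `metadata.jsonl` file with `file_name` and `text` fields, one JSON object per line.

```python
import json
from pathlib import Path


def check_dataset(train_dir, meta_name="metadata.jsonl"):
    """Return a list of human-readable problems found in the dataset folder."""
    root = Path(train_dir)
    problems = []
    with open(root / meta_name) as meta_file:
        for line_no, line in enumerate(meta_file, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            row = json.loads(line)
            file_name = row.get("file_name")
            # Every record should point at an image that exists in the folder.
            if not file_name or not (root / file_name).is_file():
                problems.append(f"line {line_no}: missing image {file_name!r}")
            # Every record should carry a non-empty caption.
            if not row.get("text"):
                problems.append(f"line {line_no}: empty caption")
    return problems
```

An empty list means every metadata line points at an existing image and has a caption; anything else is cheaper to fix now than after a failed run.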

hosnasn1987 commented 9 months ago

Hi,

Thank you for your complete answer.