[Bug]: Every video I generate has a shutterstock watermark?

watzon commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

Are you using the latest version of the extension?

[X] I have the modelscope text2video extension updated to the latest version and I still have the issue.

What happened?

I'm not sure why this is happening. I followed the install instructions, have all the models, and am using the standard SD 1.5 model (though I have tried others). No matter what I do, everything I generate has a Shutterstock watermark.

https://user-images.githubusercontent.com/4535422/226511836-106a9f38-c3bc-464d-90ff-912612128ed1.mp4

Steps to reproduce the problem

Install as normal
Use a prompt such as "car driving down the freeway at night" with the negative prompt "text, watermark, copyright, blurry"
Profit?

What should have happened?

Ideally the watermark wouldn't be there

WebUI and Deforum extension Commit IDs

webui commit id - a9fed7c3
txt2vid commit id - ab1c4e74

What GPU were you using for launching?

RTX 4080

On which platform are you launching the webui backend with the extension?

No response

Settings

Console logs

Restoring base VAE
Applying xformers cross attention optimization.
VAE weights loaded.
ModelScope text2video extension for auto1111 webui
Git commit: ab1c4e74 (Mon Mar 20 22:22:46 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Starting text2video
False
DDIM sampling tensor(1): 100%|██████████| 36/36 [00:24<00:00,  1.48it/s]
STARTING VAE ON GPU. 1 CHUNKS TO PROCESS
DECODING FRAMES
VAE FINISHED
torch.Size([30, 3, 256, 256])
output/mp4s/20230320_212917337065.mp4
text2video finished, saving frames to /home/watzon/Pictures/generated/stable-diffusion/img2img-images/text2video-modelscope/20230320212832
Got a request to stitch frames to video using FFmpeg.
Frames:
/home/watzon/Pictures/generated/stable-diffusion/img2img-images/text2video-modelscope/20230320212832/%06d.png
To Video:
/home/watzon/Pictures/generated/stable-diffusion/img2img-images/text2video-modelscope/20230320212832/vid.mp4
Stitching *video*...
Stitching *video*...
Video stitching done in 0.12 seconds!
t2v complete, result saved at /home/watzon/Pictures/generated/stable-diffusion/img2img-images/text2video-modelscope/20230320212832

sALTaccount commented 1 year ago

The watermark is unrelated to this repo. The model you choose in the top right doesn't affect generations either, because the extension uses the text2video model you downloaded and put in the models folder. file:///tmp/tmp7hul_mfg.mp4 Here is a video I generated with the base repo, not this plugin, and it has the same Shutterstock watermark. The cause is that the people who trained the text2video model included lots of Shutterstock videos in their training data.

hithereai commented 1 year ago

Indeed not a bug in the repo, but instead an issue with the training data.

kabachuha commented 1 year ago

This is unrelated to the extension (and we most likely won't include watermark removal, just as we didn't include automatic YouTube downloads). However, there are neural watermark removers that you can find and download on the internet; you can pass all the frames of the video through one and then reassemble them.
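
For anyone wanting to try the frame-by-frame approach, here is a rough sketch of the workflow, not something the extension provides. It assumes the frames are already saved as `%06d.png` files (as in the console log above) and that FFmpeg is on your PATH. The names `clean_frames`, `ffmpeg_stitch_command`, and the `remove_watermark` callable are hypothetical: you would plug in whichever watermark remover you pick, wrapped as a function that takes PNG bytes and returns PNG bytes. The `fps` default is also just an assumption.

```python
from pathlib import Path
from typing import Callable, List


def clean_frames(src_dir: Path, dst_dir: Path,
                 remove_watermark: Callable[[bytes], bytes]) -> List[Path]:
    """Pass every PNG frame in src_dir through remove_watermark, writing results to dst_dir.

    remove_watermark is a placeholder for a real remover (hypothetical here):
    it receives the raw PNG bytes of one frame and returns the cleaned PNG bytes.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sort so frames are processed in order (000000.png, 000001.png, ...).
    for frame in sorted(src_dir.glob("*.png")):
        out = dst_dir / frame.name  # keep the same name so FFmpeg's pattern still matches
        out.write_bytes(remove_watermark(frame.read_bytes()))
        written.append(out)
    return written


def ffmpeg_stitch_command(frames_dir: Path, output: Path, fps: int = 8) -> List[str]:
    """Build (but do not run) an FFmpeg command that stitches %06d.png frames into a video.

    fps=8 is an assumed value; use whatever frame rate you generated the clip with.
    """
    return [
        "ffmpeg", "-y",                      # overwrite the output file if it exists
        "-framerate", str(fps),              # input frame rate of the image sequence
        "-i", str(frames_dir / "%06d.png"),  # numbered frame pattern
        "-c:v", "libx264",                   # H.264 encoding
        "-pix_fmt", "yuv420p",               # widely compatible pixel format
        str(output),
    ]


# Example usage (paths are placeholders; my_remover is your own wrapper):
#
#   import subprocess
#   frames = Path("text2video-modelscope/20230320212832")
#   cleaned = frames / "cleaned"
#   clean_frames(frames, cleaned, my_remover)
#   subprocess.run(ffmpeg_stitch_command(cleaned, cleaned / "vid.mp4"), check=True)
```

Keeping the remover as a plain bytes-in, bytes-out callable means the loop doesn't care which tool you end up using, and the original frames are left untouched in case the result looks worse than the watermark.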

kabachuha / sd-webui-text2video