ParkSungHin opened this issue 1 month ago
If you encounter problems when using environment.yaml, I suggest you install the key dependencies I recommend, including torch, accelerate, xformers, diffusers, and transformers. Since the diffusers library iterates quickly and later versions can be incompatible with earlier ones, it is recommended to install the version I provide. If you can provide more error information, I will be glad to give more guidance.
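Before re-running anything, it can help to confirm which versions of those key dependencies are actually installed so you can compare them against environment.yaml. A minimal sketch using only the standard library (the package names are the ones listed above; the pinned versions themselves come from environment.yaml, not from this snippet):

```python
from importlib import metadata

def report_versions(packages):
    """Return a mapping of package name -> installed version (or None)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    # The key dependencies mentioned above; install the versions pinned
    # in environment.yaml rather than the latest releases.
    for name, ver in report_versions(
        ["torch", "accelerate", "xformers", "diffusers", "transformers"]
    ).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this on both a working and a broken environment makes version mismatches easy to spot and report.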
Thanks to this, I finished setting up the environment on Linux. Do I have to download all the files in metadata.json? I have downloaded the data, but it seems difficult to download all of them because of errors like these:
ERROR: [youtube] QCJyJup0qcc: Private video. Sign in if you've been granted access to this video
ERROR: [youtube] n5p24NNdycc: Video unavailable. This video has been removed by the uploader
ERROR: [youtube] eo1TV_1KZsE: Video unavailable
Sorry for the late reply, I was on vacation last week. You don't necessarily need to download all the videos in metadata.json, because some may have been removed due to YouTube's restrictions. The YouTube videos do not account for a large proportion of our StorySalon dataset, so you can focus on the data from the open-source libraries. You can also search for suitable YouTube videos and use the data processing pipeline we provide to expand the dataset further.
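Since some videos are simply gone, a downloader that skips those errors but still surfaces real failures (network, auth, misconfiguration) is handy. A sketch under the assumption that yt-dlp is on PATH and that you loop over the video IDs from metadata.json; the error strings matched below are the ones quoted above:

```python
import re
import subprocess

# Errors that just mean the video is gone, not that your setup is broken.
SKIPPABLE = re.compile(r"Private video|Video unavailable")

def is_skippable_error(stderr_text):
    """True for download errors that should be skipped, not retried."""
    return bool(SKIPPABLE.search(stderr_text))

def download(video_id, out_dir="videos"):
    """Try to download one video; return True on success, False if skipped.

    Assumes yt-dlp is installed; re-raises any non-skippable failure.
    """
    proc = subprocess.run(
        ["yt-dlp", "-o", f"{out_dir}/%(id)s.%(ext)s",
         f"https://www.youtube.com/watch?v={video_id}"],
        capture_output=True, text=True,
    )
    if proc.returncode == 0:
        return True
    if is_skippable_error(proc.stderr):
        return False
    proc.check_returncode()  # re-raise genuine failures
```

With this, removed or private videos are logged and skipped, and the run continues with the rest of the list.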
Thank you! I'm now in the middle of data processing. May I ask a question that came up during the process?
Is it expected that extract.py extracts only two pairs out of the many storybook videos and VTT files?
For the yolov7.pt file used in human_ocr_mask.py, can I use the real-person detection model provided by the official yolov7 GitHub repository as-is? Or should I fine-tune it to match the illustrated style of the storybook images?
Thank you, that solved the problem! However, the next step, inpaint.py, fails with the error shown in the picture below; could you tell me how to fix it?
Since the inpainting pipeline is borrowed entirely from the Stable Diffusion implementation, we did not include that code in our repository. You can follow our README.md to download the related code and dependencies from https://github.com/CompVis/stable-diffusion
Hello, I'm trying to build the environment from environment.yaml on Windows, and many things fail to run. My GPU is an RTX 4070 Ti SUPER, so I suspect the pinned PyTorch and CUDA versions won't match it. How should I approach this?
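For newer GPUs, a common approach is to keep the repo's pinned library versions but install a torch wheel built for the CUDA toolkit your driver supports. PyTorch publishes per-CUDA wheel indexes whose tags follow a simple naming convention (e.g. CUDA 12.1 → cu121); a small helper illustrating that mapping, with the example version purely hypothetical:

```python
def torch_index_url(cuda_version):
    """Map a CUDA version string to the matching PyTorch pip index URL.

    Follows PyTorch's wheel-index naming convention (12.1 -> cu121).
    Check pytorch.org for the tags your driver and the pinned torch
    version actually support before installing.
    """
    tag = "cu" + cuda_version.replace(".", "")
    return f"https://download.pytorch.org/whl/{tag}"

# Example (hypothetical CUDA 12.1 install):
#   pip install torch --index-url https://download.pytorch.org/whl/cu121
```

Matching the wheel's CUDA build to the driver usually resolves "works on Linux, fails on Windows" version mismatches for recent cards.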