SoftologyPro opened this issue 6 months ago
To get around the model issue, I created a models subdirectory and then git cloned the models under it:
git clone https://huggingface.co/RunDiffusion/Juggernaut-XL-v8
git clone https://huggingface.co/SG161222/RealVisXL_V4.0
git clone https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
git clone https://huggingface.co/stablediffusionapi/sdxl-unstable-diffusers-y
and changed the script code to
models_dict = {
    "Juggernaut": "./models/Juggernaut-XL-v8" if not use_va else "RunDiffusion/Juggernaut-XL-v8",
    "RealVision": "./models/RealVisXL_V4.0" if not use_va else "SG161222/RealVisXL_V4.0",
    "SDXL": "./models/stable-diffusion-xl-base-1.0" if not use_va else "stabilityai/stable-diffusion-xl-base-1.0",
    "Unstable": "./models/sdxl-unstable-diffusers-y" if not use_va else "stablediffusionapi/sdxl-unstable-diffusers-y",
}
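For reference, a minimal sketch of why the local-clone workaround functions. This is not the repo's exact loading code, just an assumption that it loads checkpoints with diffusers' from_pretrained, which accepts either a local directory or a Hugging Face repo id:

import torch
from diffusers import StableDiffusionXLPipeline

# "./models/RealVisXL_V4.0" is the local git clone created above;
# passing the hub id "SG161222/RealVisXL_V4.0" instead would download it.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "./models/RealVisXL_V4.0",
    torch_dtype=torch.float16,
).to("cuda")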
Where should I download/clone this from?
photomaker_path = "/mnt/bn/dq-storage-ckpt/zyp/magicstory_dev/photomaker-v1.bin" if use_va else "/mnt/bn/yupengdata2/projects/PhotoMaker/photomaker-v1.bin"
Also, can you change the image format from webp to png? Makes them easier to save and use with other editors.
Hi, I previously tested using local weights and forgot to remove them. I have now modified the code so it no longer requires loading any local weights. Please run git pull to update, and thank you again for your attention. Also, regarding the image format: do you mean the reference images in the examples directory? I have changed those from JPEG to PNG.
Also, using git lfs clone for these pre-trained models is a good idea, since downloading them costs too much time. Thanks for your advice!
It still reports no file named diffusion_pytorch_model.bin after git pull; actually nothing was updated and it showed "Already up to date"... please help...
I see, just clone those 4 models and copy them into the corresponding folders under the .cache path.
and it throws a new error:
Traceback (most recent call last):
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\queueing.py", line 501, in call_prediction
    output = await route_utils.call_process_api(
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\route_utils.py", line 258, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\blocks.py", line 1710, in process_api
    result = await self.call_function(
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\blocks.py", line 1262, in call_function
    prediction = await utils.async_iteration(iterator)
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\utils.py", line 517, in async_iteration
    return await iterator.__anext__()
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\utils.py", line 510, in __anext__
    return await anyio.to_thread.run_sync(
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\utils.py", line 493, in run_sync_iterator_async
    return next(iterator)
  File "C:\Users\namiachy\StoryDiffusion\venv\lib\site-packages\gradio\utils.py", line 676, in gen_wrapper
    response = next(iterator)
  File "C:\Users\namiachy\StoryDiffusion\gradio_app_sdxl_specific_id.py", line 591, in process_generation
    total_results = get_comic(id_images + real_images, _comic_type,captions= captions,font=ImageFont.truetype("./fonts/Inkfree.ttf", int(45))) + total_results
  File "C:\Users\namiachy\StoryDiffusion\utils\utils.py", line 97, in get_comic
    return get_comic_classical(images,captions,font,pad_image)
  File "C:\Users\namiachy\StoryDiffusion\utils\utils.py", line 113, in get_comic_classical
    pad_image = pad_image.resize(images[0].size, Image.ANTIALIAS)
AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
please help ...
To fix that you need to roll back Pillow, i.e.

pip uninstall -y pillow
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts pillow==9.5.0
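If you would rather not downgrade, here is a sketch of a forward fix, assuming the only use of the removed constant is the resize call in utils/utils.py. Pillow 10 removed the Image.ANTIALIAS alias; Image.Resampling.LANCZOS is the same filter under its current name.

from PIL import Image

# Pick a resample filter that works on both old and new Pillow versions.
try:
    LANCZOS = Image.Resampling.LANCZOS  # Pillow >= 9.1
except AttributeError:
    LANCZOS = Image.ANTIALIAS  # older Pillow

img = Image.new("RGB", (512, 512))
resized = img.resize((256, 256), LANCZOS)  # same call shape as in utils/utils.py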
I mean the generated images in the Gradio UI once the comic panel is created. They seem to be in WebP format. PNG would be easier.
It would be easier if your script auto-downloaded the required models into the user's .cache directory, without users having to clone repositories and copy them around. That is how most other AI systems handle models.
With the latest code I now get
OSError: Error no file named diffusion_pytorch_model.bin found in directory <cache_path>\.cache\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\vae.
It does auto-download the photomaker model, so that now works.
Also, one more tip: the IP should be 127.0.0.1 so the link works when clicked on Windows, and share should be False by default for security.
demo.launch(server_name="127.0.0.1", share=False)
I'm very confused by this project. All the advertising is about video generation, and I finally got it working and downloaded the models.
I assumed pipeline.py was doing some sort of magic to the checkpoints, but all it did was generate ipadapter / id_encoder images. What's the point of advertising all the video stuff if we can't test the video out, and the online Gradio demo doesn't work either? I don't understand these "legal" issue things mentioned; I just read that comment from the dev.
I've made better videos with a more advanced ComfyUI workflow: modified timestep and attention-window code from Comfy-SVDTools, latent blends, SVD model merges (merging a couple of the blocks makes it better), and LoRAs on the SVD model, which produce errors, but some of the blocks still seem to merge somehow in ComfyUI and keep a trained subject consistent. I'll be releasing the workflow and code soon. I've gotten a consistent 96 frames with a lot of motion and facial expressions. I don't get this repo.
See the To Do section of the readme; video is coming soon. I do agree repos should not release anything until they are done and ready, but they do sometimes encounter legal or other issues outside their control that stop them from releasing. At this stage this Gradio app generates only the comic panel images.
Yeah, I saw it before I downloaded, but I assumed at least a little code would be there in pipeline.py; there is nothing there for video at all.
I forgot to mention: for this upcoming workflow I made for SVD, I also do batch whole_image input of 48-96 randomly generated images, or you can feed real images into the latent input on the KSampler in the ComfyUI workflow, which I will release.
Am I supposed to be worried about getting sued also? This is what Section 230 is for if you are in the US, unless you are worried about the model being trained on copyrighted material, but as OpenAI has said, "training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents". I dunno, something just seems off. Sorry if I'm coming off as brusque, btw.
I'd be very interested in your SVD workflow for a project I'm working on (trying to reconstruct missing Doctor Who episodes from the 60s using old off-screen photos and audio tapes). I'd love to see how your workflow does.
@christiandarkin Not sure anything will be published; @brentjohnston doesn't have any repo. Copyright law is the concern of model makers, not of people sharing workflows.
Wall of text, but yes, I will publish it and post the link here and on the Banodoco Discord; I probably need a week or two to finalize. This is not my main account for repos. Don't want to get sued (jk, joke lol); not a concern, as this is fair use. I trained a full OneTrainer finetune on the Sora woman (who doesn't exist), so I will release it as fan service, free.
P.S. If anyone's interested, I also have a Star Trek TNG computer I will share at some point, with a finetuned alltalk_tts model. It's a cloned computer voice that sounds exactly like the Next Generation version of the ship's computer, and you can talk with it without pressing any buttons. I would think that would be more likely to get sued over than a text-to-video trained model like this repo, but even that I would not be making any money on; it's more of a fan service. But I'm also not a lawyer. https://old.reddit.com/r/Oobabooga/comments/1bj7tx4/guide_the_easiest_way_to_modify_oobabooga_colors/
That is done through text-generation-webui. I also have a voice clone of my own voice for a company project, with an LLM trained on company data, sounding better than ElevenLabs v2 now.
But a side note on the moral side of all this, because the dreamtalk dev removed their weights link this week (dumb, and it got me thinking): imo, people will just become desensitized over time to bad stuff that pops up on the internet, like those fake Microsoft calls that everyone and their grandma now knows to question (it took a while). Worst case, I'm sure I'll see a naked AI video of me somewhere doing lewd things or worse and care less, and so will everyone else. It's just that initial shock factor because it's so new, but the human brain adapts quickly. My hope is the population develops better critical thinking skills.
The only thing I'm really, genuinely worried about is people making viruses etc. with AI, tbh, but I have no knowledge in that area and never would want to. I would hope it is still very difficult even with readily available information.
TL;DR: I don't do any of this for money, btw, or to stir up VC investment excitement at conferences, but because I love it and think it's amazing tech that will have more positive impact on humanity than negative. But yes, finding a balance and being responsible is also important.
Update: still working on it and not completely happy with some aspects yet; it's too hard to reproduce with all the custom models, LoRAs, and modified code in various parts of ComfyUI. Will release when it's ready.
Original comment:
Yup, I'll post it here and on the Banodoco Discord in the next week or two. I'm pretty happy with the body movement and coherence, and even facial expressions and blinking (it's no Sora, though).
It will require:
- a custom merged SVD model to download (a lot of blocks failed to merge, but it still made a difference),
- an older ComfyUI commit for the whole_batch image latent inputs (a modified WAS nodes batch-images node, thanks to a Reddit user),
- a modified attention_patch.py from the Comfy-SVDTools repo,
- the comfy math nodes,
- the ComfyUI-0246 repo for junctions, to batch images and pluck some as init images for svd_img2vid_conditioning,
- some custom LoRAs that keep the subject in the video from fading out or breaking apart,
- perturbed guidance on multiple blocks,
- Power Noise KSampler settings, and
- NVIDIA Align Your Steps for both SVD and image generation (using the kijai version of AYS for the older ComfyUI commit).
It's kind of a bit of setup. I'll give exact instructions, and you can use ComfyUI Manager for most of it. It produces videos very similar to this repo's, but a bit longer in length. I'm just hoping someone can help me make it better.
Anyway, back on topic to StoryDiffusion. For now, the comic image creation works fine with manual model downloads. If the script could be changed to auto-download the needed models, that would make it much easier for users.
Hi @SoftologyPro, thank you again for your attention to our work. Could you try pip install safetensors==0.4.0?
I had safetensors 0.4.3; rolling back to 0.4.0 still gives this error. It does start to download the model files, but then fails for some reason (it finishes way too quickly), then it errors out with the cannot-find-file error. I deleted the cache directory and tried again; same error.
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory <path to cache>\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\text_encoder_2.
@SoftologyPro Sorry to bother you again. I modified the code to load_safe_tensor = True. Could you git pull and try again? I do not have a Windows machine with a GPU; if it still does not work, I will borrow a computer to test it.
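For context, a guess at what such a switch maps to in diffusers (assuming load_safe_tensor toggles use_safetensors, which makes from_pretrained load the *.safetensors shards instead of looking for diffusion_pytorch_model.bin):

import torch
from diffusers import StableDiffusionXLPipeline

# RealVisXL_V4.0 ships safetensors weights, so forcing safetensors
# should avoid the missing-.bin OSError reported above.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
)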
OK, that works :) Thanks. No need to manually download the models any more. I am happy to test under Windows.
Hi, could you tell me which folder the pre-trained models finally end up in?
The cache on my C drive is not big enough for these big files T T
Help please! Which folder do these big models finally go in? I want to download them in advance and put them there so that running python gradio_app_sdxl_specific_id.py won't have to re-download them. Is that possible? It looks like the user at the top did something like this with the models_dict code.
Since it's "./models/Juggernaut-XL-v8", will they end up in models/ under the root directory?
Help, please!
Wall of text, but yes, I will publish it and post the link here and on the Banodoco Discord; I probably need a week or two to finalize.
Happy to read that. Sorry if I was mistaken, @brentjohnston.
Hi guys,
Thanks for your interest! Running the code on Windows is important. If you have further questions, please consider joining the Discord for better communication: https://discord.gg/2HFUHT9p
Best regards, DQ
I added
_snapshot_download(repo_id="SG161222/RealVisXL_V4.0")
below the line
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
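Put together, the pre-download workaround looks roughly like this (a sketch assuming the plain huggingface_hub helpers; snapshot_download pulls the full repo into the local Hugging Face cache so later from_pretrained calls find it):

from huggingface_hub import hf_hub_download, snapshot_download

# Fetch the single PhotoMaker checkpoint file.
photomaker_path = hf_hub_download(
    repo_id="TencentARC/PhotoMaker",
    filename="photomaker-v1.bin",
    repo_type="model",
)
# Pre-fetch the whole SDXL checkpoint repo into the cache.
snapshot_download(repo_id="SG161222/RealVisXL_V4.0")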
But it would be good to pick up the models from the Stable Diffusion downloads I have for ComfyUI as a safetensors file.
Wow~ There was indeed no change no matter how I tried to git pull that day; it was the same after deleting and reinstalling. And this morning I saw the change when I tried git pull again, and everything is normal now. I appreciate your help and great work~!
As SoftologyPro said, you can just delete these 3 folders under the ".cache\huggingface\hub\" path: "models--stablediffusionapi--sdxl-unstable-diffusers-y", "models--RunDiffusion--Juggernaut-XL-v8", and "models--SG161222--RealVisXL_V4.0", then run "git pull" in the venv. Run "python gradio_app_sdxl_specific_id.py" and select a model in the web UI; any missing model will be auto-downloaded. Haha~
Copy the C drive's .cache directory to some path xxxx, then set the environment variable HF_HOME=xxxx; that way you don't need to hard-code your absolute path.
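A sketch of that relocation in Python (the D:\hf_cache path is just an example; HF_HOME must be set before anything from huggingface_hub or diffusers is imported, or set it system-wide instead):

import os

os.environ["HF_HOME"] = r"D:\hf_cache"  # cached models then land under D:\hf_cache\hub

from huggingface_hub import snapshot_download  # imported after HF_HOME is set

snapshot_download(repo_id="SG161222/RealVisXL_V4.0")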
python gradio_app_sdxl_specific_id.py
gives

ValueError: The provided pretrained_model_name_or_path "/mnt/bn/yupengdata2/projects/PhotoMaker/RealVisXL_V4.0" is neither a valid local path nor a valid repo id. Please check the parameter.
which makes sense because that directory does not exist. Changing the code to the models_dict shown at the top does get the models downloading when the script first starts, but then gives the error
OSError: Error no file named diffusion_pytorch_model.bin found in directory D:\.cache\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\vae.
How can we get the required models to download correctly? Thanks.