Insightface models not loading to memory - Called on each execution of roop

TylerBizz commented 1 year ago

Something appears to have changed over the last few days. Face swapping for images used to be fast but is now very slow: about a minute on a low-res txt2img and up to 3-5 minutes when upscaling. I'm running Stable Diffusion WebUI 1.5 on a Mac M2. It used to be a short 5-second wait for the face swap. Did something change?

glucauze commented 1 year ago

Which version of Python are you using? Python 3.11 is slower with roop.

TylerBizz commented 1 year ago

python 3.10.11

100%|██████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:08<00:00, 2.36it/s]
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}████| 20/20 [00:06<00:00, 2.80it/s]
find model: /Users/tybizzz/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
/Users/tybizzz/stable-diffusion-webui/venv/lib/python3.10/site-packages/insightface/utils/transform.py:68: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /Users/tybizzz/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
2023-07-17 03:31:14,686 - roop - INFO - Restore face with CodeFormer
Total progress: 100%|██████████████████████████████████████████████████████████████████████| 20/20 [01:06<00:00, 3.33s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████████| 20/20 [01:06<00:00, 2.80it/s]

TylerBizz commented 1 year ago

It never used to show all the info on applied providers, finding models, etc. in the terminal. It was much, much quicker before. Is there a better Python version to use for roop?

glucauze commented 1 year ago

It always has :). As you can see, the code has not been changed since last week, and I doubt that disabling the GPU would have changed anything on your side. My guess is that something has changed on your end.

Python 3.10 might be the best choice. Note that I am using 3.11 with few problems on Linux. Roop will be slow for the first image (model loading is pretty slow).

A strange but possible explanation would be that the model is reloaded each time roop is executed. This would mean significant latency for every image generated (instead of only the first). This shouldn't happen, however, as the model is stored in memory and reused later.
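
One rough way to tell the two cases apart is to time a few consecutive swaps: if only the first call is slow, the model is probably being cached; if every call takes about as long, it is probably being reloaded. A minimal sketch, assuming you can wrap one swap in a zero-argument function (`time_calls`, `looks_reloaded`, and the `ratio` threshold are hypothetical names and values for illustration, not part of roop):

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], n: int = 3) -> List[float]:
    """Run fn n times and return the wall-clock duration of each call in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def looks_reloaded(durations: List[float], ratio: float = 0.5) -> bool:
    """Heuristic: if even the fastest later call takes at least `ratio` of the
    first call's time, the expensive load is probably repeated on every call."""
    first, rest = durations[0], durations[1:]
    return bool(rest) and min(rest) >= ratio * first
```

The threshold is arbitrary; the point is only to compare the first call against the later ones.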

TylerBizz commented 1 year ago

Thanks. Must be something on my end. The model does reload each time roop is executed for me. When adding more than one face it loads for each instance, and when upscaling it reloads for every tile and every face. A simple upscale used to take a couple of minutes and now takes almost an hour. It is definitely not stored in memory. Image generation takes about 7 seconds and roop takes a minute for every image with a single face swap. Any ideas on how to get it to load to memory, or to force it to? Some setting must have changed on my side or I'm missing something.

glucauze commented 1 year ago

Hum no, actually you are right: https://github.com/s0md3v/sd-webui-roop/blob/main/scripts/swapper.py#L76 is the problem. The analysis model is now reloaded each time :/ This is the most time-consuming part. It should be loaded once (singleton pattern).

You can change it back to the previous one or try my PR https://github.com/s0md3v/sd-webui-roop/pull/152 which is not perfect either but does not have this problem.
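
For readers unfamiliar with the singleton idea mentioned above, here is a minimal sketch of loading a model once and reusing it. It is not the code from swapper.py or from the PR. `FakeFaceAnalysis` and `get_analysis_model` are hypothetical names, and the stand-in class only counts how often it is constructed, so the example runs without insightface installed.

```python
import threading


class FakeFaceAnalysis:
    """Stand-in for the expensive model object; counts how often it is built."""
    instances = 0

    def __init__(self):
        FakeFaceAnalysis.instances += 1


_analysis_model = None          # module-level cache, shared by all callers
_lock = threading.Lock()        # guards first-time creation across threads


def get_analysis_model(factory=FakeFaceAnalysis):
    """Return the cached model, building it with `factory` only on first use.

    Assumption: with insightface, the factory would build something like
    insightface.app.FaceAnalysis(name="buffalo_l", providers=[...]) and call
    .prepare(ctx_id=0, det_size=(640, 640)), matching the log above.
    """
    global _analysis_model
    if _analysis_model is None:
        with _lock:
            # Re-check inside the lock so two threads cannot both build it.
            if _analysis_model is None:
                _analysis_model = factory()
    return _analysis_model
```

Every swap would then call `get_analysis_model()` instead of constructing the model itself, so the slow load should happen only for the first image.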

TylerBizz commented 1 year ago

OK, thanks. I'm not sure how to change back or use your PR. I would be keen on the latter. Can you provide some quick guidance on how to install it?

glucauze commented 1 year ago

Someone already did: https://github.com/s0md3v/sd-webui-roop/pull/152#issuecomment-1636811506

Just note that if you are using the PR, you will not receive any updates from the official branch unless you switch back (but you will get updates from the PR, which might be less stable). It is a quick and dirty fix.

TylerBizz commented 1 year ago

hmmm, I'm getting no swap and this:

100%|██████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:07<00:00, 2.77it/s]
Error running postprocess: /Users/tylerbizzz/old-stable-diffusion-webui/extensions/sd-webui-roop/scripts/faceswap.py
Traceback (most recent call last):
  File "/Users/tylerbizzz/old-stable-diffusion-webui/modules/scripts.py", line 478, in postprocess
    script.postprocess(p, processed, *script_args)
  File "/Users/tylerbizzz/old-stable-diffusion-webui/extensions/sd-webui-roop/scripts/faceswap.py", line 348, in postprocess
    if self.enabled :
  File "/Users/tylerbizzz/old-stable-diffusion-webui/extensions/sd-webui-roop/scripts/faceswap.py", line 201, in enabled
    return any([u.enable for u in self.units]) and not shared.state.interrupted
AttributeError: 'FaceSwapScript' object has no attribute 'units'

glucauze commented 1 year ago

Are you using vladmandic? If so, I just pushed a fix for that. Try to update anyway. Old versions of a1111 may lack a before_process method, so I switched back to process. This should fix it as well.

TylerBizz commented 1 year ago

Not on Vlad, but that update did work. I like it: it has some nice features and gives the speed I was getting before. The only anomaly is the eyes not quite looking where they should. Will keep testing. Thank you. Now if the original could load the models into memory like this, it would be back to working how it was.

glucauze commented 1 year ago

Hum, that's the very same underlying model and configuration (unless you use the upscaled inswapper, which is disabled by default). I'd be really surprised if the results weren't the same; it should give the same result for the eyes. Play with the post-processing box to see whether it comes from restore face or something else :)

TylerBizz commented 1 year ago

I did find that a clean install of SD with Python 3.10.6 initially resolved the speed issue on the official branch, but once I started adding extensions like ControlNet the issue reappeared. So I'm not sure exactly what is causing the speed issue on a Mac with the official branch, but I am now using @glucauze's PR with great success and no speed issues, plus the benefit of added features. Thank you, thank you. Closing this issue.

s0md3v / sd-webui-roop

Insightface models not loading to memory - Called on each execution of roop #174