Haidra-Org / AI-Horde-Worker

This repo turns your PC into an AI Horde worker node
GNU Affero General Public License v3.0

ViT-L/14 jobs failing due to TypeError #93

Closed JodanJodan closed 1 year ago

JodanJodan commented 1 year ago

Worker is getting forced into maintenance mode due to repeated ViT-L/14 failures.

trace.log :

ERROR      | 2023-03-09 18:27:31.384936 | worker.jobs.stable_diffusion:start_job:353 - Something went wrong when processing request. Please check your trace.log file for the full stack trace. Payload: {'prompt': 'Cortana, full body, face, Big ass, Big breast, anus', 'height': 512, 'width': 512, 'seed': '282081520', 'tiling': False, 'n_iter': 1, 'batch_size': 1, 'save_individual_images': False, 'save_grid': False, 'ddim_steps': 30, 'sampler_name': 'k_euler_karras', 'cfg_scale': 7.0, 'hires_fix': False, 'request_type': 'txt2img', 'model': 'ViT-L/14'}
TRACE      | 2023-03-09 18:27:31.387937 | worker.jobs.stable_diffusion:start_job:359 - Traceback (most recent call last):
  File "\AI-Horde-Worker\worker\jobs\stable_diffusion.py", line 347, in start_job
    generator.generate(**gen_payload)
  File "\AI-Horde-Worker\conda\envs\windows\lib\site-packages\nataili\util\cast.py", line 29, in wrap
    return amp.autocast(device_type="cuda", dtype=dtype)(no_grad()(func))(*args, **kwargs)
  File "\AI-Horde-Worker\conda\envs\windows\lib\site-packages\torch\amp\autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "\AI-Horde-Worker\conda\envs\windows\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "\AI-Horde-Worker\conda\envs\windows\lib\site-packages\nataili\stable_diffusion\compvis.py", line 204, in generate
    with model_context as model:
  File "\AI-Horde-Worker\conda\envs\windows\lib\contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "\AI-Horde-Worker\conda\envs\windows\lib\site-packages\nataili\util\voodoo.py", line 109, in load_from_plasma
    with open(ref, "rb") as cache:
TypeError: expected str, bytes or os.PathLike object, not CLIP
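
The last frame of the trace shows `open(ref, "rb")` receiving a model object where a file path was expected. A minimal sketch of that failure mode, assuming nothing about nataili's internals: the `CLIP` class and `load_cached` function below are hypothetical stand-ins used only for illustration, not the real classes or the real `load_from_plasma`.

```python
class CLIP:
    """Hypothetical stand-in for an already-loaded model object."""


def load_cached(ref):
    # Same call shape as the failing line in the trace: open() accepts a
    # str, bytes, or os.PathLike path, so a model instance is rejected.
    with open(ref, "rb") as cache:
        return cache.read()


try:
    load_cached(CLIP())
except TypeError as exc:
    message = str(exc)
    print(message)
```

The error text names the type of the object that was passed, which is why the trace ends in "not CLIP": the reference handed to the loader was a CLIP model instance rather than a path to a cached file.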

db0 commented 1 year ago

How is ViT-L/14 being selected for generation? Are you perhaps loading the inpainting model?

jug-dev commented 1 year ago

I have tried to reproduce this problem, including by submitting txt2img jobs with the model erroneously set to "stable_diffusion_inpainting", but with the latest code and environment as of Friday 10 March, everything seems OK.

Could you share your bridgeData.yaml? Also, please make sure you have run git pull and .\update-runtime.cmd

JodanJodan commented 1 year ago

The bridge was already running with f4b0b52; I haven't encountered this issue since, so it could be resolved. Bridge logs from the same time:


```
WARNING    | 2023-03-09 18:27:31.379645 | worker.jobs.stable_diffusion:start_job:182 - Model stable_diffusion_inpainting chosen for txt2img or img2img gen, switching to ViT-L/14 instead.
DEBUG      | 2023-03-09 18:27:31.381649 | worker.jobs.stable_diffusion:start_job:225 - txt2img (ViT-L/14) request with id 1ac78671-b335-44d6-9699-0ac1ff597986 picked up. Initiating work...
INFO       | 2023-03-09 18:27:31.381649 | worker.jobs.stable_diffusion:start_job:339 - Starting generation: ViT-L/14 @ 512x512 for 30 steps. Prompt length is 51 characters And it appears to contain 0 weights
ERROR      | 2023-03-09 18:27:31.384936 | worker.jobs.stable_diffusion:start_job:353 - Something went wrong when processing request. Please check your trace.log file for the full stack trace. Payload: {'prompt': 'Cortana, full body, face, Big ass, Big breast, anus', 'height': 512, 'width': 512, 'seed': '282081520', 'tiling': False, 'n_iter': 1, 'batch_size': 1, 'save_individual_images': False, 'save_grid': False, 'ddim_steps': 30, 'sampler_name': 'k_euler_karras', 'cfg_scale': 7.0, 'hires_fix': False, 'request_type': 'txt2img', 'model': 'ViT-L/14'}
```

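
The WARNING line shows the worker swapping stable_diffusion_inpainting for ViT-L/14 on a txt2img job, and ViT-L/14 is a CLIP model rather than an image-generation checkpoint. A rough sketch of a fallback that only considers generation checkpoints might look like the following. Everything here is an assumption for illustration: the `loaded_models` registry, the `pick_fallback` helper, and the `"ckpt"`/`"clip"`/`"inpainting"` type tags are hypothetical and are not taken from the worker's code.

```python
from typing import Dict, Optional

# Hypothetical registry of loaded models: name -> type tag.
# Neither the structure nor the tags come from AI-Horde-Worker.
loaded_models: Dict[str, str] = {
    "ViT-L/14": "clip",
    "stable_diffusion_inpainting": "inpainting",
    "stable_diffusion": "ckpt",
}


def pick_fallback(requested: str, models: Dict[str, str]) -> Optional[str]:
    """Return a model usable for txt2img/img2img, or None if there is none."""
    # Keep the requested model when it is already a generation checkpoint.
    if models.get(requested) == "ckpt":
        return requested
    # Otherwise take the first loaded generation checkpoint, skipping
    # CLIP and inpainting entries.
    for name, kind in models.items():
        if kind == "ckpt":
            return name
    return None


fallback = pick_fallback("stable_diffusion_inpainting", loaded_models)
print(fallback)
```

Returning `None` when no suitable checkpoint is loaded lets the caller reject the job outright instead of attempting generation with a model that cannot produce images.
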
jug-dev commented 1 year ago

I'm closing this as an anomaly during various horde/worker updates. Please reopen if you see this again.