toriato / stable-diffusion-webui-wd14-tagger

Labeling extension for Automatic1111's Web UI

Tuple index out of range error #22

Closed · trihardseven closed this issue 1 year ago

trihardseven commented 1 year ago

Getting this error after updating to latest commit:

To create a public link, set `share=True` in `launch()`.
Downloading Waifu Diffusion tagger model files from SmilingWolf/wd-v1-4-vit-tagger
Loading Waifu Diffusion tagger model from C:\Users\Rance\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc
2022-12-06 15:35:25.984105: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-06 15:35:26.616247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 19056 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:29:00.0, compute capability: 8.6
Error completing request
Arguments: (<PIL.Image.Image image mode=RGB size=956x968 at 0x25F4373D690>, '', False, '', '[name].[output_extension]', 'ignore', False, 'wd14', 0.35, '', '', False, False, True, '0_0, (o)_(o), +_+, +_-, ._., <o>_<o>, <|>_<|>, =_=, >_<, 3_3, 6_9, >_o, @_@, ^_^, o_o, u_u, x_x, |_|, ||_||', False) {}
Traceback (most recent call last):
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\modules\call_queue.py", line 45, in f
    res = list(func(*args, **kwargs))
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\modules\call_queue.py", line 28, in f
    res = func(*args, **kwargs)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\scripts\tagger.py", line 295, in give_me_the_tags
    ratings, tags = interrogator.interrogate(image)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\tagger\interrogator.py", line 218, in interrogate
    self.load()
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\tagger\interrogator.py", line 193, in load
    self.model = tf.keras.models.load_model(
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 45, in error_translator
    raise errors_impl.OpError(None, None, error_message, errors_impl.UNKNOWN)
tensorflow.python.framework.errors_impl.OpError: file is too short to be an sstable

Traceback (most recent call last):
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
    output = await app.blocks.process_api(
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 983, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 913, in postprocess_data
    if predictions[i] is components._Keywords.FINISHED_ITERATING:
IndexError: tuple index out of range
SmilingWolf commented 1 year ago
> tensorflow.python.framework.errors_impl.OpError: file is too short to be an sstable

This error suggests one of the model files may be truncated. Try wiping C:\Users\Rance\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc and letting it download the model again.
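
(For anyone unsure how to wipe it, a minimal sketch, assuming the cache path from the log above; the whole model folder can be deleted, it will be re-downloaded on the next run:)

```python
# Rough sketch: delete the cached tagger model so huggingface_hub re-downloads it.
# The path matches the one printed in the log above; adjust the model id if needed.
import shutil
from pathlib import Path

cache_dir = Path.home() / ".cache" / "huggingface" / "hub" / "models--SmilingWolf--wd-v1-4-vit-tagger"
if cache_dir.exists():
    shutil.rmtree(cache_dir)
    print(f"Removed {cache_dir}")
else:
    print(f"Nothing to remove at {cache_dir}")
```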

trihardseven commented 1 year ago

Did that. Same error though.

Downloading: 100%|████████████████████████████████████████████████████████████████| 3.81M/3.81M [00:02<00:00, 1.30MB/s]
Downloading: 100%|███████████████████████████████████████████████████████████████████| 328k/328k [00:00<00:00, 713kB/s]
Downloading: 100%|████████████████████████████████████████████████████████████████| 13.8k/13.8k [00:00<00:00, 13.8MB/s]
Downloading: 100%|███████████████████████████████████████████████████████████████████| 365M/365M [08:08<00:00, 749kB/s]
Downloading: 100%|███████████████████████████████████████████████████████████████████| 174k/174k [00:00<00:00, 422kB/s]
Loading Waifu Diffusion tagger model from C:\Users\Rance\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc
2022-12-06 23:54:33.343392: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-06 23:54:34.046237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 18972 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:29:00.0, compute capability: 8.6
Error completing request
Arguments: (<PIL.Image.Image image mode=RGB size=640x960 at 0x2AE384697B0>, '', False, '', '[name].[output_extension]', 'ignore', False, 'wd14', 0.35, '', '', False, False, True, '0_0, (o)_(o), +_+, +_-, ._., <o>_<o>, <|>_<|>, =_=, >_<, 3_3, 6_9, >_o, @_@, ^_^, o_o, u_u, x_x, |_|, ||_||', False) {}
Traceback (most recent call last):
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\modules\call_queue.py", line 45, in f
    res = list(func(*args, **kwargs))
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\modules\call_queue.py", line 28, in f
    res = func(*args, **kwargs)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\scripts\tagger.py", line 295, in give_me_the_tags
    ratings, tags = interrogator.interrogate(image)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\tagger\interrogator.py", line 218, in interrogate
    self.load()
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\tagger\interrogator.py", line 193, in load
    self.model = tf.keras.models.load_model(
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 45, in error_translator
    raise errors_impl.OpError(None, None, error_message, errors_impl.UNKNOWN)
tensorflow.python.framework.errors_impl.OpError: file is too short to be an sstable

Traceback (most recent call last):
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
    output = await app.blocks.process_api(
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 983, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 913, in postprocess_data
    if predictions[i] is components._Keywords.FINISHED_ITERATING:
IndexError: tuple index out of range
SmilingWolf commented 1 year ago

Could you provide the output of pip freeze from within the webui venv, a list of any other extensions you have installed, and the webui commit hash?
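
(If you're unsure how to grab those from inside the venv, here's a rough sketch; the paths are the ones from the logs above and the `git` call assumes git is on your PATH:)

```python
# Rough sketch: dump pip freeze and the webui commit hash using the venv's interpreter.
# Paths are taken from the logs in this thread; substitute your own install location.
import subprocess

venv_python = r"H:\The Games Folder\AI Image generatore\stable-diffusion-webui\venv\Scripts\python.exe"
webui_dir = r"H:\The Games Folder\AI Image generatore\stable-diffusion-webui"

freeze = subprocess.run([venv_python, "-m", "pip", "freeze"],
                        capture_output=True, text=True).stdout
commit = subprocess.run(["git", "-C", webui_dir, "rev-parse", "HEAD"],
                        capture_output=True, text=True).stdout.strip()

with open("webui-diagnostics.txt", "w") as f:
    f.write(f"commit: {commit}\n\n{freeze}")
print("Wrote webui-diagnostics.txt")
```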

sashasubbbb commented 1 year ago

Got the same problem after the last update; reinstalling didn't help. I don't really think it's conflicting with anything, but anyway:

Extensions: (screenshot attached)
pip freeze: https://pastebin.com/GzWgbv6L
Latest commit: 44c46f0ed395967cd3830dd481a2db759fda5b3b

lendrick commented 1 year ago

Same issue here.

Doesn't look like it's being caused by any of the extensions, because I tried disabling all of the other ones.

Commit hash: 44c46f0ed395967cd3830dd481a2db759fda5b3b
Extensions: (screenshot attached)
pip freeze: https://pastebin.com/7cF4ZbaN

trihardseven commented 1 year ago

I made a brand new AUTOMATIC1111 installation and had the same issue. Here is what I got when first installing:

Creating venv in directory venv using python "H:\Python\python.exe"
venv "F:\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 44c46f0ed395967cd3830dd481a2db759fda5b3b
Installing torch and torchvision
Installing gfpgan
Installing clip
Installing open_clip
Cloning Stable Diffusion into repositories\stable-diffusion-stability-ai...
Cloning Taming Transformers into repositories\taming-transformers...
Cloning K-diffusion into repositories\k-diffusion...
Cloning CodeFormer into repositories\CodeFormer...
Cloning BLIP into repositories\BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments:
No module 'xformers'. Proceeding without it.
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights [2700c435] from F:\stable-diffusion-webui\models\Stable-diffusion\Anything-V3.0-pruned.ckpt
Applying cross attention optimization (Doggettx).
Model loaded.
Loaded a total of 0 textual inversion embeddings.
Embeddings:
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Error running install.py for extension F:\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger.
Command: "F:\stable-diffusion-webui\venv\Scripts\python.exe" "F:\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\install.py"
Error code: 1
stdout: Installing requirements for wd14-tagger

stderr: Traceback (most recent call last):
  File "F:\stable-diffusion-webui\extensions\stable-diffusion-webui-wd14-tagger\install.py", line 4, in <module>
    launch.run_pip("install tensorflow", "requirements for wd14-tagger")
  File "F:\stable-diffusion-webui\launch.py", line 78, in run_pip
    return run(f'"{python}" -m pip {args} --prefer-binary{index_url_line}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}")
  File "F:\stable-diffusion-webui\launch.py", line 49, in run
    raise RuntimeError(message)
RuntimeError: Couldn't install requirements for wd14-tagger.
Command: "F:\stable-diffusion-webui\venv\Scripts\python.exe" -m pip install tensorflow --prefer-binary
Error code: 1
stdout: Collecting tensorflow
  Using cached tensorflow-2.11.0-cp310-cp310-win_amd64.whl (1.9 kB)
Collecting tensorflow-intel==2.11.0
  Using cached tensorflow_intel-2.11.0-cp310-cp310-win_amd64.whl (266.3 MB)
Collecting libclang>=13.0.0
  Using cached libclang-14.0.6-py2.py3-none-win_amd64.whl (14.2 MB)
Requirement already satisfied: tensorboard<2.12,>=2.11 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (2.11.0)
Collecting google-pasta>=0.1.1
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting flatbuffers>=2.0
  Downloading flatbuffers-22.12.6-py2.py3-none-any.whl (26 kB)
Collecting h5py>=2.9.0
  Using cached h5py-3.7.0-cp310-cp310-win_amd64.whl (2.6 MB)
Requirement already satisfied: absl-py>=1.0.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (1.3.0)
Requirement already satisfied: typing-extensions>=3.6.6 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (4.4.0)
Collecting wrapt>=1.11.0
  Using cached wrapt-1.14.1-cp310-cp310-win_amd64.whl (35 kB)
Collecting keras<2.12,>=2.11.0
  Using cached keras-2.11.0-py2.py3-none-any.whl (1.7 MB)
Collecting protobuf<3.20,>=3.9.2
  Using cached protobuf-3.19.6-cp310-cp310-win_amd64.whl (895 kB)
Collecting tensorflow-estimator<2.12,>=2.11.0
  Using cached tensorflow_estimator-2.11.0-py2.py3-none-any.whl (439 kB)
Collecting termcolor>=1.1.0
  Using cached termcolor-2.1.1-py3-none-any.whl (6.2 kB)
Collecting astunparse>=1.6.0
  Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1
  Using cached tensorflow_io_gcs_filesystem-0.28.0-cp310-cp310-win_amd64.whl (1.5 MB)
Requirement already satisfied: numpy>=1.20 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (1.23.3)
Requirement already satisfied: grpcio<2.0,>=1.24.3 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (1.51.1)
Collecting opt-einsum>=2.3.2
  Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
Requirement already satisfied: setuptools in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (63.2.0)
Collecting gast<=0.4.0,>=0.2.1
  Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)
Requirement already satisfied: six>=1.12.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (1.16.0)
Requirement already satisfied: packaging in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorflow-intel==2.11.0->tensorflow) (22.0)
Requirement already satisfied: wheel<1.0,>=0.23.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from astunparse>=1.6.0->tensorflow-intel==2.11.0->tensorflow) (0.38.4)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (0.4.6)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (1.8.1)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (0.6.1)
Requirement already satisfied: werkzeug>=1.0.1 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2.2.2)
Requirement already satisfied: requests<3,>=2.21.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2.25.1)
Requirement already satisfied: markdown>=2.6.8 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (3.4.1)
Requirement already satisfied: google-auth<3,>=1.6.3 in f:\stable-diffusion-webui\venv\lib\site-packages (from tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2.15.0)
Requirement already satisfied: rsa<5,>=3.1.4 in f:\stable-diffusion-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (4.9)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (5.2.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in f:\stable-diffusion-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (0.2.8)
Requirement already satisfied: requests-oauthlib>=0.7.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (1.3.1)
Requirement already satisfied: certifi>=2017.4.17 in f:\stable-diffusion-webui\venv\lib\site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2022.12.7)
Requirement already satisfied: chardet<5,>=3.0.2 in f:\stable-diffusion-webui\venv\lib\site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in f:\stable-diffusion-webui\venv\lib\site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (1.26.13)
Requirement already satisfied: idna<3,>=2.5 in f:\stable-diffusion-webui\venv\lib\site-packages (from requests<3,>=2.21.0->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2.10)
Requirement already satisfied: MarkupSafe>=2.1.1 in f:\stable-diffusion-webui\venv\lib\site-packages (from werkzeug>=1.0.1->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (2.1.1)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in f:\stable-diffusion-webui\venv\lib\site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (0.4.8)
Requirement already satisfied: oauthlib>=3.0.0 in f:\stable-diffusion-webui\venv\lib\site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.12,>=2.11->tensorflow-intel==2.11.0->tensorflow) (3.2.2)
Installing collected packages: libclang, flatbuffers, wrapt, termcolor, tensorflow-io-gcs-filesystem, tensorflow-estimator, protobuf, opt-einsum, keras, h5py, google-pasta, gast, astunparse, tensorflow-intel, tensorflow
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.20.0
    Uninstalling protobuf-3.20.0:
      Successfully uninstalled protobuf-3.20.0

stderr: ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'F:\\stable-diffusion-webui\\venv\\Lib\\site-packages\\google\\~rotobuf\\internal\\_api_implementation.cp310-win_amd64.pyd'
Check the permissions.

[notice] A new release of pip available: 22.2.1 -> 22.3.1
[notice] To update, run: F:\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip

Afterwards I restarted the webui and the tagger gave the same error as always. So I'm thinking it's a problem with one of the extension's requirements, or maybe something a different extension installed on the system is conflicting with it?

Any clues on how to clean up the system of all of that?
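
(For what it's worth, the `~rotobuf` folder in that error is the kind of leftover pip creates when an uninstall gets interrupted; a rough cleanup sketch, assuming the F:\stable-diffusion-webui path from the log above and that the webui is shut down first:)

```python
# Rough cleanup sketch for the "[WinError 5] Access is denied ... ~rotobuf" error above.
# pip renames a package folder to "~<name>" while replacing it; if that step is interrupted
# (e.g. the webui still has the files open), the temporary folder is left behind and later
# installs can trip over it. Close the webui first, then remove the stray folders:
import shutil
from pathlib import Path

site_packages = Path(r"F:\stable-diffusion-webui\venv\Lib\site-packages")  # adjust to your install

for folder in (site_packages, site_packages / "google"):
    for leftover in folder.glob("~*"):
        print(f"Removing leftover {leftover}")
        shutil.rmtree(leftover, ignore_errors=True)

# Afterwards, re-run the webui so the extension's install.py can retry; if protobuf is left
# in a broken state you may also need to reinstall it manually from the venv.
```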

sneed-formerly-chuck commented 1 year ago

OK, so I seem to have found a solution to the first problem! Just DM me and I'll send you the ans- I'm just kidding. First, download the tagger model from here: https://mega.nz/file/ptA2jSSB#G4INKHQG2x2pGAVQBn-yd_U5dMgevGF8YYM9CR_R1SY

After you unpack it, you'll want these files:

  • 2022_0000_0899_6549\selected_tags.csv
  • networks\ViTB16_11_03_2022_07h05m53s\keras_metadata.pb
  • networks\ViTB16_11_03_2022_07h05m53s\saved_model.pb
  • networks\ViTB16_11_03_2022_07h05m53s\variables\variables.data-00000-of-00001
  • networks\ViTB16_11_03_2022_07h05m53s\variables\variables.index

Next, navigate to your C:\Users\(your username here)\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc folder and replace the three corresponding files there with the ones you just grabbed, then do the same in the \variables subfolder with the other two files.
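
(Roughly, that replacement step amounts to the following sketch; the source folder name is whatever you unpacked the archive into, and the snapshot hash is the one from the logs in this thread:)

```python
# Rough sketch of the manual replacement described above.
# Assumes the MEGA archive was unpacked into .\wd-v1-4-vit-tagger-backup and that the
# snapshot hash matches the one from the logs; adjust both paths for your machine.
import shutil
from pathlib import Path

src = Path(r".\wd-v1-4-vit-tagger-backup")
dst = Path.home() / (".cache/huggingface/hub/models--SmilingWolf--wd-v1-4-vit-tagger/"
                     "snapshots/0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc")

files = [
    (r"2022_0000_0899_6549\selected_tags.csv", "selected_tags.csv"),
    (r"networks\ViTB16_11_03_2022_07h05m53s\keras_metadata.pb", "keras_metadata.pb"),
    (r"networks\ViTB16_11_03_2022_07h05m53s\saved_model.pb", "saved_model.pb"),
    (r"networks\ViTB16_11_03_2022_07h05m53s\variables\variables.data-00000-of-00001",
     r"variables\variables.data-00000-of-00001"),
    (r"networks\ViTB16_11_03_2022_07h05m53s\variables\variables.index",
     r"variables\variables.index"),
]

for rel_src, rel_dst in files:
    target = dst / rel_dst
    target.unlink(missing_ok=True)   # remove an existing file/symlink rather than copying onto it
    shutil.copy2(src / rel_src, target)
    print(f"Replaced {rel_dst}")
```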

Now run the webui bat, and it should work! At least it did for me, so I hope it does for you too.

toriato commented 1 year ago

I removed the cache and the pip packages from my Windows and Linux environments and tested after a clean installation, but I could not reproduce this bug.

The code itself doesn't seem to be the problem. Could there be a problem on the Hugging Face side?

I'm trying to find a way to fix it. If you need to use the extension right now, roll back to https://github.com/toriato/stable-diffusion-webui-wd14-tagger/commit/ee167eabe8e02e39385d02d109cec2a5ac5718fd.
Also, the MEGA file in @sneed-formerly-chuck's reply is an outdated model, but it works for now, so you can follow that method instead.

SmilingWolf commented 1 year ago

I've been trying to break my own venv, trying various versions of Tensorflow between 2.9 - 2.11 (the models were trained using TF 2.10.0), redownloading the weights from HF multiple times, etc. Yet so far, I haven't managed to reproduce the error.

Assuming something went wrong with the serialization of the model (yet most users seem to be ok?), could anybody with a broken setup try and put the saved_model.pb from this: saved_model.zip in the C:\Users\(your username here)\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc folder and try again?

Mind you: all other files MUST be the ones from HuggingFace, not the ones from the MEGA zip.

a1270 commented 1 year ago

> I've been trying to break my own venv, trying various versions of Tensorflow between 2.9 - 2.11 (the models were trained using TF 2.10.0), redownloading the weights from HF multiple times, etc. Yet so far, I haven't managed to reproduce the error.
>
> Assuming something went wrong with the serialization of the model (yet most users seem to be ok?), could anybody with a broken setup try and put the saved_model.pb from this: saved_model.zip in the C:\Users\(your username here)\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc folder and try again?
>
> Mind you: all other files MUST be the ones from HuggingFace, not the ones from the MEGA zip.

Just replacing that one file didn't work, but manually copying all the blob files to the correct locations and renaming them to the proper filenames did work. Symlinks on Windows are rather new, so there may be conflicting registry settings. Or Windows is just being Windows.

Note that copying a file with the same name as a symlink seems to send it to the great beyond; it doesn't replace the symlink with the file.
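
(What was done here by hand can be sketched roughly like this: walk the snapshot folder and replace every symlink with a real copy of the blob it points to; the snapshot path is the one from the logs above:)

```python
# Rough sketch of the manual fix described above: replace the HF-hub symlinks in the
# snapshot folder with real copies of the blob files they point to, so TF/Keras can
# read them even when symlink resolution misbehaves.
import shutil
from pathlib import Path

snapshot = Path.home() / (".cache/huggingface/hub/models--SmilingWolf--wd-v1-4-vit-tagger/"
                          "snapshots/0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc")

for path in list(snapshot.rglob("*")):
    if path.is_symlink():
        blob = path.resolve()      # the real file in the ../../blobs folder
        path.unlink()              # remove the symlink first instead of copying onto it
        shutil.copy2(blob, path)   # put a real file with the proper name in its place
        print(f"De-symlinked {path.relative_to(snapshot)} <- {blob.name}")
```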

SmilingWolf commented 1 year ago

That's it! @a1270 you da man!

So all the users with problems are launching a command shell as admins, HF hub is creating symlinks, and subsequent reads from TF/Keras in load_model fail to follow the symlink back to the original file. This leads to failure to load the model, failure to run predictions (honestly could have used some try/catch during model loading, my bad), returning no predictions at all, and finally to the Tuple index error that got us all reunited here.

Meanwhile, I was launching the shell as a normal user, so no symlink, because HF hub was downloading files right into the final directory as per https://huggingface.co/docs/huggingface_hub/how-to-cache, and no problems.

So the immediate fix is to NOT run the shell as Administrator. Next up is how to force TF/Keras to honor symlinks.

EDIT: narrowed it down to variables/variables.index. If only this one file is copied instead of symlinked, everything else works as expected.
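
(A quick way to check whether your own cache is affected, assuming the same snapshot path as above:)

```python
# Quick diagnostic sketch: is variables.index a symlink in your cached snapshot?
from pathlib import Path

index_file = Path.home() / (".cache/huggingface/hub/models--SmilingWolf--wd-v1-4-vit-tagger/"
                            "snapshots/0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc/"
                            "variables/variables.index")
print("symlink" if index_file.is_symlink() else "regular file", "->", index_file)
```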

SmilingWolf commented 1 year ago

Did some more digging:

So, two solutions:

Please note: Administrator or not, you're running the models on CPU anyway if you don't roll back your TF installation to v2.10.

There's also the possibility of completely forgoing TF's nonsense and switching to ONNX Runtime, but you lose access to the official DeepDanbooru models and have to rely on third-party conversions. As far as my models are concerned, ONNX versions are "officially" supported, in that I already use them for inference in other settings and will upload them together with the TF models on HF Hub. This one is up to @toriato, of course.
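
(For reference, a minimal sketch of what the ONNX Runtime path looks like; the file names, the fixed 448x448 input size, the BGR/0-255 preprocessing, and the `name` column in selected_tags.csv are assumptions here, not the extension's actual implementation:)

```python
# Minimal ONNX Runtime inference sketch for a WD14-style tagger, to illustrate the
# alternative being discussed. Check the model card for the exact preprocessing.
import csv

import numpy as np
import onnxruntime as ort
from PIL import Image

MODEL_PATH = "model.onnx"          # hypothetical local copy of the ONNX export
TAGS_PATH = "selected_tags.csv"    # tag list shipped alongside the model

session = ort.InferenceSession(MODEL_PATH, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
height = width = 448  # assumed fixed input size for the WD14 taggers

image = Image.open("example.png").convert("RGB").resize((width, height))
arr = np.asarray(image, dtype=np.float32)[:, :, ::-1]   # RGB -> BGR (assumed, as in the TF version)
batch = np.ascontiguousarray(arr)[None]                  # add batch dimension

probs = session.run(None, {input_name: batch})[0][0]

with open(TAGS_PATH, newline="", encoding="utf-8") as f:
    tags = [row["name"] for row in csv.DictReader(f)]

for tag, p in sorted(zip(tags, probs), key=lambda x: -x[1])[:10]:
    print(f"{tag}: {p:.3f}")
```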

toriato commented 1 year ago

Of course I don't mind using ONNX. However, converting a DeepDanbooru model to ONNX is up to each individual model maker, and explaining the conversion process to users would be too complicated.

So I think it would be better to add ONNX support on top of the existing TensorFlow support.

Using Torch, TensorFlow, and ONNX together in one program is a bit silly, but I think this is the best approach.

Thank you all for your help! If there is anything that can be improved, don't hesitate to let me know!

trihardseven commented 1 year ago

> OK, so I seem to have found a solution to the first problem! Just DM me and I'll send you the ans- I'm just kidding. First, download the tagger model from here: https://mega.nz/file/ptA2jSSB#G4INKHQG2x2pGAVQBn-yd_U5dMgevGF8YYM9CR_R1SY
>
> After you unpack it, you'll want these files:
>
>   • 2022_0000_0899_6549\selected_tags.csv
>   • networks\ViTB16_11_03_2022_07h05m53s\keras_metadata.pb
>   • networks\ViTB16_11_03_2022_07h05m53s\saved_model.pb
>   • networks\ViTB16_11_03_2022_07h05m53s\variables\variables.data-00000-of-00001
>   • networks\ViTB16_11_03_2022_07h05m53s\variables\variables.index
>
> Next, navigate to your C:\Users\(your username here)\.cache\huggingface\hub\models--SmilingWolf--wd-v1-4-vit-tagger\snapshots\0b4934b152c4a15ed726ec89c6bc7e2ab41d9fbc folder and replace the three corresponding files there with the ones you just grabbed, then do the same in the \variables subfolder with the other two files.
>
> Now run the webui bat, and it should work! At least it did for me, so I hope it does for you too.

The file got removed...

> That's it! @a1270 you da man!
>
> So all the users with problems are launching a command shell as admins, HF hub is creating symlinks, and subsequent reads from TF/Keras in load_model fail to follow the symlink back to the original file. This leads to failure to load the model, failure to run predictions (honestly could have used some try/catch during model loading, my bad), returning no predictions at all, and finally to the Tuple index error that got us all reunited here.
>
> Meanwhile, I was launching the shell as a normal user, so no symlink, because HF hub was downloading files right into the final directory as per https://huggingface.co/docs/huggingface_hub/how-to-cache, and no problems.
>
> So the immediate fix is to NOT run the shell as Administrator. Next up is how to force TF/Keras to honor symlinks.
>
> EDIT: narrowed it down to variables/variables.index. If only this one file is copied instead of symlinked, everything else works as expected.

Where can I find this variables file?

trihardseven commented 1 year ago

Found the file and it worked! I will say I don't run the webUI as admin, so I'm not sure what caused this issue.

toriato commented 1 year ago

I modified the extension to use ONNX, as recommended by @SmilingWolf. If it works well, I'll close the issue.

rrweller commented 1 year ago

I am still having this issue and the above linked files are no longer available

SmilingWolf commented 1 year ago

> I am still having this issue and the above linked files are no longer available

Make sure you're on the latest version of the extension and post the full log from the console.

toriato commented 1 year ago

There seem to be no further problems, so I'll close the issue.