oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Multi-GPU cannot load transformers on a single card #6003

Open Urammar opened 5 months ago

Urammar commented 5 months ago

Describe the bug

This is a reproduction of #4193. It appears this was never adequately fixed, or has since regressed.

Is there an existing issue for this?

Reproduction

As above

Screenshot

No response

Logs

00:52:58-495385 INFO     Loading "TheBloke_TinyLlama-1.1B-1T-OpenOrca-GPTQ"
00:52:58-501386 INFO     Loading with disable_exllama=True and disable_exllamav2=False.
00:52:58-503387 INFO     TRANSFORMERS_PARAMS=
{   'low_cpu_mem_usage': True,
    'torch_dtype': torch.float16,
    'device_map': 'auto',
    'max_memory': {0: '20000MiB', 1: '0MiB', 'cpu': '15000MiB'},
    'quantization_config': GPTQConfig(quant_method=<QuantizationMethod.GPTQ: 'gptq'>)}

00:52:58-694430 ERROR    Failed to load the model.
Traceback (most recent call last):
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\modules\ui_model_menu.py", line 247, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\modules\models.py", line 94, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\modules\models.py", line 256, in huggingface_loader
    model = LoaderClass.from_pretrained(path_to_model, **params)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\installer_files\env\Lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\installer_files\env\Lib\site-packages\transformers\modeling_utils.py", line 3618, in from_pretrained
    max_memory = get_max_memory(max_memory)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\installer_files\env\Lib\site-packages\accelerate\utils\modeling.py", line 791, in get_max_memory
    max_memory[key] = convert_file_size_to_int(max_memory[key])
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\MyShit\AI\oobabooga_windows\text-generation-webui2\installer_files\env\Lib\site-packages\accelerate\utils\modeling.py", line 129, in convert_file_size_to_int
    raise ValueError(err_msg)
ValueError: `size` 0MiB is not in a valid format. Use an integer for bytes, or a string with an unit (like '5.0GB').

System Info

1080 Ti, 3090 Ti
Urammar commented 5 months ago

I found and fixed the error in \installer_files\env\Lib\site-packages\accelerate\utils\modeling.py

Line 128 currently reads:

if mem_size <= 0:
    raise ValueError(err_msg)
return mem_size
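
For context, the failing conversion can be reproduced on its own (a minimal sketch, assuming the same accelerate build as in the traceback above): "0MiB" parses to 0 bytes, and the <= 0 comparison then rejects it.

# Minimal reproduction, assuming the accelerate version shown in the traceback above
from accelerate.utils.modeling import convert_file_size_to_int

# "0MiB" parses to 0 bytes; the mem_size <= 0 check on line 128 then raises:
# ValueError: `size` 0MiB is not in a valid format. Use an integer for bytes, or a string with an unit (like '5.0GB').
convert_file_size_to_int("0MiB")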

For multi-GPU setups, this should be:

if mem_size < 0:
    raise ValueError(err_msg)
return mem_size

This fix is not entirely correct, as it also permits setting 0 for every field, which would attempt to load a model with no memory allocated at all and give no error message. As a temporary fix, however, it lets multi-GPU setups load models with the transformers loader at all, so I'll take the janky win.

But this behavior needs to be updated to check for multiple GPUs and only complain if no VRAM is set across all of the detected cards.
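
One possible shape for that check, as a hypothetical sketch rather than actual accelerate or webui code (it assumes the <= 0 comparison has already been relaxed to < 0 as above, so that "0MiB" converts to 0 instead of raising): validate the whole max_memory map and only raise when every device comes out to zero.

from accelerate.utils.modeling import convert_file_size_to_int

def validate_max_memory(max_memory):
    # Convert each entry to bytes; individual devices may legitimately be set to 0 to exclude them.
    sizes = {device: convert_file_size_to_int(size) for device, size in max_memory.items()}
    # Only complain when no device (GPU or CPU) has any memory allocated at all.
    if all(size == 0 for size in sizes.values()):
        raise ValueError("max_memory assigns 0 bytes to every device; the model cannot be loaded.")
    return sizes

# Example with the settings from the log above: GPU 1 is excluded,
# but GPU 0 and the CPU still have memory, so loading can proceed.
validate_max_memory({0: "20000MiB", 1: "0MiB", "cpu": "15000MiB"})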