camenduru / text-generation-webui-colab

A colab gradio web UI for running Large Language Models
The Unlicense

falcon-7b-instruct-GPTQ-4bit.ipynb #9

Closed by camenduru 1 year ago

camenduru commented 1 year ago
INFO:Gradio HTTP request redirected to localhost :)
WARNING:trust_remote_code is enabled. This is dangerous.
WARNING:The gradio "share link" feature uses a proprietary executable to create a reverse tunnel. Use it with care.
2023-06-06 21:55:46.220247: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
INFO:Loading falcon-7b-instruct-GPTQ...
INFO:The AutoGPTQ params are: {'model_basename': 'gptq_model-4bit-64g', 'device': 'cuda:0', 'use_triton': False, 'use_safetensors': True, 'trust_remote_code': True, 'max_memory': None, 'quantize_config': None}
WARNING:CUDA extension not installed.
WARNING:The safetensors archive passed at models/falcon-7b-instruct-GPTQ/gptq_model-4bit-64g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
WARNING:can't get model's sequence length from model config, will set to 4096.
WARNING:RWGPTQForCausalLM hasn't fused attention module yet, will skip inject fused attention.
WARNING:RWGPTQForCausalLM hasn't fused mlp module yet, will skip inject fused mlp.
INFO:Loaded the model in 36.17 seconds.

INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://ccd3202fc68d7be036.gradio.live/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
ERROR:Task exception was never retrieved
future: <Task finished name='hszag9ma4as_118' coro=<Queue.process_events() done, defined at /usr/local/lib/python3.10/dist-packages/gradio/queueing.py:343> exception=ValidationError(model='PredictBody', errors=[{'loc': ('data',), 'msg': 'field required', 'type': 'value_error.missing'}])>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 347, in process_events
    client_awake = await self.gather_event_data(event)
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 220, in gather_event_data
    data, client_awake = await self.get_message(event, timeout=receive_timeout)
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 456, in get_message
    return PredictBody(**data), True
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for PredictBody
data
  field required (type=value_error.missing)
Output generated in 15.99 seconds (0.94 tokens/s, 15 tokens, context 67, seed 1207267814)
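
The throughput in the last log line is tokens divided by elapsed seconds. A minimal sketch to check that figure, assuming the summary line always has the format shown above; the regex and the `tokens_per_second` helper are hypothetical and not part of text-generation-webui:

```python
import re

# The summary line copied verbatim from the log above.
LINE = (
    "Output generated in 15.99 seconds "
    "(0.94 tokens/s, 15 tokens, context 67, seed 1207267814)"
)

# Hypothetical pattern: assumes the "Output generated in ..." layout seen in this log.
PATTERN = re.compile(
    r"Output generated in (?P<seconds>[\d.]+) seconds "
    r"\((?P<rate>[\d.]+) tokens/s, (?P<tokens>\d+) tokens"
)


def tokens_per_second(line):
    """Return (computed_rate, reported_rate) for one summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a generation summary line")
    seconds = float(match.group("seconds"))
    tokens = int(match.group("tokens"))
    # Recompute the rate from the raw numbers instead of trusting the log.
    return tokens / seconds, float(match.group("rate"))


computed, reported = tokens_per_second(LINE)
print(f"computed {computed:.2f} tokens/s, reported {reported:.2f} tokens/s")
```

For this run the recomputed rate rounds to the reported 0.94 tokens/s, so the slow generation is real rather than a reporting glitch. The `CUDA extension not installed` warning earlier in the log is a plausible cause.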