aws-samples / aws-lex-conv-faq

Demonstration of LLM integration into a Lex bot using Lambda code hooks and a SageMaker endpoint.
MIT No Attribution

ValueError: Unsupported model type falcon #5

Open · smart-patrol opened this issue 12 months ago

smart-patrol commented 12 months ago

I am getting the following error. I will switch to FLAN instead and retry (see the sketch after the log below).

2023-10-06T18:32:49.087661Z  INFO text_generation_launcher: Args { model_id: "tiiuae/falcon-7b-instruct", revision: None, sharded: None, num_shard: Some(1), quantize: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1000, max_total_tokens: 1512, max_batch_size: None, waiting_served_ratio: 1.2, max_batch_total_tokens: 32000, max_waiting_tokens: 20, port: 8080, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/tmp"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, env: false }
2023-10-06T18:32:49.087762Z  INFO text_generation_launcher: Starting download process.
2023-10-06T18:32:51.646683Z  WARN download: text_generation_launcher: No safetensors weights found for model tiiuae/falcon-7b-instruct at revision None. Downloading PyTorch weights.
2023-10-06T18:32:51.722142Z  INFO download: text_generation_launcher: Download file: pytorch_model-00001-of-00002.bin
2023-10-06T18:33:02.752506Z  INFO download: text_generation_launcher: Downloaded /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/pytorch_model-00001-of-00002.bin in 0:00:11.
2023-10-06T18:33:02.752599Z  INFO download: text_generation_launcher: Download: [1/2] -- ETA: 0:00:11
2023-10-06T18:33:02.752840Z  INFO download: text_generation_launcher: Download file: pytorch_model-00002-of-00002.bin
2023-10-06T18:33:07.745517Z  INFO download: text_generation_launcher: Downloaded /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/pytorch_model-00002-of-00002.bin in 0:00:04.
2023-10-06T18:33:07.745585Z  INFO download: text_generation_launcher: Download: [2/2] -- ETA: 0
2023-10-06T18:33:07.745660Z  WARN download: text_generation_launcher: No safetensors weights found for model tiiuae/falcon-7b-instruct at revision None. Converting PyTorch weights to safetensors.
2023-10-06T18:33:07.745780Z  INFO download: text_generation_launcher: Convert /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/pytorch_model-00001-of-00002.bin to /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/model-00001-of-00002.safetensors.
2023-10-06T18:33:19.153703Z  INFO download: text_generation_launcher: Convert: [1/2] -- Took: 0:00:11.407687
2023-10-06T18:33:19.153776Z  INFO download: text_generation_launcher: Convert /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/pytorch_model-00002-of-00002.bin to /tmp/models--tiiuae--falcon-7b-instruct/snapshots/cf4b3c42ce2fdfe24f753f0f0d179202fea59c99/model-00002-of-00002.safetensors.
2023-10-06T18:33:24.271356Z  INFO download: text_generation_launcher: Convert: [2/2] -- Took: 0:00:05.117383
2023-10-06T18:33:24.829104Z  INFO text_generation_launcher: Successfully downloaded weights.
2023-10-06T18:33:24.829294Z  INFO text_generation_launcher: Starting shard 0
2023-10-06T18:33:28.178304Z ERROR shard-manager: text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 67, in serve
    server.serve(model_id, revision, sharded, quantize, trust_remote_code, uds_path)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 155, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize, trust_remote_code))
  File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 634, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 124, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 314, in get_model
    raise ValueError(f"Unsupported model type {model_type}")
ValueError: Unsupported model type falcon rank=0
2023-10-06T18:33:28.832440Z ERROR text_generation_launcher: Shard 0 failed to start:
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 67, in serve
    server.serve(model_id, revision, sharded, quantize, trust_remote_code, uds_path)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 155, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize, trust_remote_code))
  File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 124, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 314, in get_model
    raise ValueError(f"Unsupported model type {model_type}")

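For the FLAN swap mentioned above, a minimal sketch of what it could look like, assuming the stack deploys the model through the SageMaker HuggingFaceModel API (the checkpoint, instance type, and variable names are illustrative, not the repo's actual code):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes an execution role is available in this environment

# flan-t5 uses the t5 architecture, which the HuggingFace LLM (TGI) container
# recognizes, so it sidesteps the "Unsupported model type falcon" error.
llm_image = get_huggingface_llm_image_uri("huggingface")

model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "google/flan-t5-xl",  # illustrative FLAN checkpoint
        "MAX_INPUT_LENGTH": "1000",          # mirrors the launcher args in the log above
        "MAX_TOTAL_TOKENS": "1512",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # illustrative instance type
)
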
nghtm commented 11 months ago

+1, I checked the CloudWatch logs and got the same error trying to deploy the stack today:

ValueError: Unsupported model type falcon

N0B5 commented 11 months ago

I ran into this same error today, but I was able to resolve it by:

1) upgrading sagemaker to the latest version (2.196.0), and
2) in endpoint_handler.py, changing the LLM image to version 1.0.3:

llm_image = get_huggingface_llm_image_uri("huggingface", version="1.0.3")
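Put together, the change looks roughly like this (only the get_huggingface_llm_image_uri call is from my endpoint_handler.py edit; the surrounding lines are a sketch):

# 1) Upgrade the SageMaker SDK so it can resolve the newer LLM container images:
#      pip install --upgrade "sagemaker>=2.196.0"

from sagemaker.huggingface import get_huggingface_llm_image_uri

# 2) Pin the HuggingFace LLM (TGI) image to 1.0.3, a release that supports the
#    falcon architecture; the older image the stack resolved by default does not.
llm_image = get_huggingface_llm_image_uri(
    "huggingface",
    version="1.0.3",
)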