meta-llama / llama-stack

Model components of the Llama Stack APIs

What config inputs are needed when building from distributions/meta-reference-gpu/build.yaml #321

Open AlexHe99 opened 3 hours ago

AlexHe99 commented 3 hours ago

System Info

NVIDIA A30

Information

šŸ› Describe the bug

I tried to build with `llama stack build --config distributions/meta-reference-gpu/build.yaml` and I do not know what values to enter for these three items when configuring the provider `(remote::pgvector)`:

Enter value for db (required):
Enter value for user (required): 
Enter value for password (required): 

Is there any documentation on how to use `distributions/meta-reference-gpu/build.yaml` to build out the distribution?
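
For reference, these three prompts appear to map to an ordinary PostgreSQL connection (database name, user, password) for a server with the pgvector extension installed. A sketch of how they might be answered, where the database name and credentials are placeholders of my own choosing, not documented defaults:

> Configuring provider `(remote::pgvector)`
Enter value for host (default: localhost) (required): localhost
Enter value for port (default: 5432) (required): 5432
Enter value for db (required): llamastack
Enter value for user (required): llamastack
Enter value for password (required): change-me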

Error logs

Full log below:

$ llama stack build --config distributions/meta-reference-gpu/build.yaml

Llama Stack is composed of several APIs working together. For each API served by the Stack,
we need to configure the providers (implementations) you want to use for these APIs.

Configuring API `inference`...
> Configuring provider `(meta-reference)`
Enter value for model (default: Llama3.1-8B-Instruct) (required):
Enter value for torch_seed (optional):
Enter value for max_seq_len (default: 4096) (required):
Enter value for max_batch_size (default: 1) (required):

Configuring API `memory`...
> Configuring provider `(meta-reference)`

> Configuring provider `(remote::chromadb)`
Enter value for host (default: localhost) (required): localhost
Enter value for port (required): 5001

> Configuring provider `(remote::pgvector)`
Enter value for host (default: localhost) (required): localhost
Enter value for port (default: 5432) (required): 5432
Enter value for db (required):
Enter value for user (required): 
Enter value for password (required): 
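
For completeness, if no pgvector-backed PostgreSQL server is reachable yet, here is a minimal local setup sketch; the pgvector/pgvector:pg16 image tag, database name, and credentials are assumptions matching the placeholders above:

# Start a local PostgreSQL server with the pgvector extension available (image tag assumed)
docker run -d --name pgvector -p 5432:5432 \
  -e POSTGRES_DB=llamastack \
  -e POSTGRES_USER=llamastack \
  -e POSTGRES_PASSWORD=change-me \
  pgvector/pgvector:pg16

# Enable the extension in the target database (the provider may also handle this itself)
docker exec -it pgvector psql -U llamastack -d llamastack -c 'CREATE EXTENSION IF NOT EXISTS vector;'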

Expected behavior

Clear and complete documentation on how to use `distributions/meta-reference-gpu/build.yaml` to build out the distribution and how to test it.

AlexHe99 commented 2 hours ago

The build itself completed, but the run failed as shown in the log below.

(LStack) opea@acc:~$ llama stack run meta-reference-gpu
Using config `/home/opea/.llama/builds/docker/meta-reference-gpu-run.yaml`
+ command -v selinuxenabled
+ mounts=
+ '[' -n '' ']'
+ '[' -n '' ']'
+ docker run -it -p 5000:5000 -v /home/opea/.llama/builds/docker/meta-reference-gpu-run.yaml:/app/config.yaml llamastack-meta-reference-gpu python -m llama_stack.distribution.server.server --yaml_config /app/config.yaml --port 5000
/opt/conda/lib/python3.10/site-packages/pydantic/_internal/_fields.py:172: UserWarning: Field name "schema" in "JsonResponseFormat" shadows an attribute in parent "BaseModel"
  warnings.warn(
Resolved 14 providers
 inner-inference => meta-reference
 inner-memory => meta-reference-00
 inner-memory => remote::chromadb-01
 inner-memory => remote::pgvector-02
 models => __routing_table__
 inference => __autorouted__
 inner-safety => meta-reference
 shields => __routing_table__
 safety => __autorouted__
 memory_banks => __routing_table__
 memory => __autorouted__
 agents => meta-reference
 telemetry => meta-reference
 inspect => __builtin__

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 346, in <module>
    fire.Fire(main)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/opt/conda/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 279, in main
    impls = asyncio.run(resolve_impls(config))
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 185, in resolve_impls
    impl = await instantiate_provider(
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/distribution/resolver.py", line 272, in instantiate_provider
    impl = await fn(*args)
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/inference/__init__.py", line 16, in get_provider_impl
    from .inference import MetaReferenceInferenceImpl
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/inference/inference.py", line 18, in <module>
    from .generation import Llama
  File "/opt/conda/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/inference/generation.py", line 38, in <module>
    from lmformatenforcer import JsonSchemaParser, TokenEnforcer, TokenEnforcerTokenizerData
ModuleNotFoundError: No module named 'lmformatenforcer'
++ error_handler 57
++ echo 'Error occurred in script at line: 57'
Error occurred in script at line: 57
++ exit 1
(LStack) opea@acc:~$
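
The traceback ends in `ModuleNotFoundError: No module named 'lmformatenforcer'`, so the built image appears to be missing that Python dependency. A possible workaround sketch, assuming the module is provided by the PyPI package lm-format-enforcer and that the image accepts arbitrary commands; the cleaner fix would be adding the package to the distribution's pip dependencies and rebuilding:

# Install the assumed package into the existing image and commit the change
docker run --name lmfe-fix llamastack-meta-reference-gpu pip install lm-format-enforcer
docker commit lmfe-fix llamastack-meta-reference-gpu
docker rm lmfe-fix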