Open: sxlijin opened this issue 3 months ago
At the moment the OpenAI API key is read from an ENV variable. Would it be possible to add support for passing this key as a header on the call to the OpenAPI RESTful endpoint?
We have a requirement where there is a proxy in front of our LLM accounts that uses OAuth authentication, so the token is only valid for 1 hour.
@k-brady-sap that's a great idea. I think we can likely just add support for this via our ClientRegistry, and that will solve it. https://docs.boundaryml.com/docs/calling-baml/client-registry
Would that work for you?
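(A rough sketch of what that could look like with the Python ClientRegistry from the linked docs, for anyone following along. The function name ExtractResume, the proxy URL, and fetch_oauth_token are placeholders, and the exact call signatures may differ slightly by version.)

import os

from baml_client import b
from baml_py import ClientRegistry


def fetch_oauth_token() -> str:
    # Placeholder: swap in your real OAuth flow against the proxy
    return os.environ["PROXY_OAUTH_TOKEN"]


async def call_with_fresh_token():
    cr = ClientRegistry()
    # Register a client whose api_key is the short-lived OAuth token
    cr.add_llm_client(
        name="ProxiedOpenAI",
        provider="openai",
        options={
            "model": "gpt-4o",
            "base_url": "https://my-llm-proxy.example.com/v1",  # placeholder proxy URL
            "api_key": fetch_oauth_token(),  # token that is only valid for ~1 hour
        },
    )
    cr.set_primary("ProxiedOpenAI")
    # The registry is passed per call, so every request can carry a fresh token
    return await b.ExtractResume("...", baml_options={"client_registry": cr})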
@hellovai Yes, that approach will probably work all right. Will it be ok to create a new LLM client for every request? We might also need to pass additional headers that contain request-specific information (e.g. tenantID), so we would end up creating a new client for each request. Would we need to delete the client after the LLM call to ensure that the client registry doesn't fill up with obsolete clients?
EDIT: Solved by installing mvn with brew install mvn - turns out the mvn command was not on the PATH, and I was using the openapi maven dependency that assumes it is on the PATH.
Hi @hellovai and team. I always get a message that says Error generating clients: Client generation failed (screenshot below), any ideas on what I can try? I am running it with npx @boundaryml/baml dev --preview. I have also tried changing the version to 0.56.1 while troubleshooting it with a colleague but saw no changes.
Contents of my generators.baml:
// This helps use auto generate libraries you can use in the language of
// your choice. You can have multiple generators if you use multiple languages.
// Just ensure that the output_dir is different for each generator.
generator target {
    // Valid values: "python/pydantic", "typescript", "ruby/sorbet", "rest/openapi"
    output_type "rest/openapi"
    // Where the generated code will be saved (relative to baml_src/)
    output_dir "../"
    // The version of the BAML package you have installed (e.g. same version as your baml-py or @boundaryml/baml).
    // The BAML VSCode extension version should also match this version.
    version "0.55.3"
    // 'baml-cli generate' will run this after generating openapi.yaml, to generate your OpenAPI client
    // This command will be run from within $output_dir
    on_generate "openapi-generator generate -i openapi.yaml -g java -o . --additional-properties invokerPackage=com.boundaryml.baml_client,modelPackage=com.boundaryml.baml_client.model,apiPackage=com.boundaryml.baml_client.api,java8=true && cd ../baml_client && mvn clean install"
    // Valid values: "sync", "async"
    default_client_mode "sync"
}
Some additional info on the version of my openapi-generator and npm:
@k-brady-sap yep, that's completely ok! And our clients allow you to pass in headers: see the headers property here: https://docs.boundaryml.com/docs/snippets/clients/providers/openai
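(Continuing the sketch above, request-specific headers can go into the same ClientRegistry options. The X-Tenant-Id header name and values are invented; headers is the openai provider option described in the linked docs.)

from baml_py import ClientRegistry


def registry_for_request(api_key: str, tenant_id: str) -> ClientRegistry:
    cr = ClientRegistry()
    cr.add_llm_client(
        name=f"client-{tenant_id}",
        provider="openai",
        options={
            "model": "gpt-4o",
            "api_key": api_key,
            # Extra headers forwarded with each LLM request, e.g. a tenant identifier
            "headers": {"X-Tenant-Id": tenant_id},
        },
    )
    cr.set_primary(f"client-{tenant_id}")
    return cr

In this shape the registry only lives for the duration of the request, which is one way to keep obsolete clients from piling up.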
EDIT: Solved
Glad you were able to solve this @lily-sap !
Would it be possible to allow the log output of the BAML service to be emitted as JSON instead of plain text? The BAML service will be running in a pod in Kubernetes and the logs of all the pods are sent to Kibana. Logging in JSON format would make them much more searchable in Kibana. Thanks.
@k-brady-sap just curious, which language are you using BAML with? That's a good idea -- perhaps we add a flag to configure this.
We're using Java. Yes a flag or env variable will work to configure this.
ok, I'll get a release out by Monday to enable a preview of this feature
@k-brady-sap can you try version 0.66.0?
You should be able to see baml logs as json using BAML_LOG_JSON=1
This is the schema we emit on each request in baml_event in the log:
struct BamlEventJson {
    // Metadata
    start_time: String,
    num_tries: usize,
    total_tries: usize,
    // LLM Info
    client: String,
    model: String,
    latency_ms: u128,
    stop_reason: Option<String>,
    // Content
    prompt: Option<String>,
    llm_reply: Option<String>,
    // JSON string
    request_options_json: Option<String>,
    // Token Usage
    tokens: Option<TokenUsage>,
    // Response/Error Info
    parsed_response_type: Option<String>,
    parsed_response: Option<String>,
    error: Option<String>,
}

#[derive(Valuable)]
struct TokenUsage {
    prompt_tokens: Option<u64>,
    completion_tokens: Option<u64>,
    total_tokens: Option<u64>,
}
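(For illustration only: assuming the struct above is serialized field-for-field, a single BAML_LOG_JSON=1 event payload would look roughly like the JSON below. The values mirror the Debug-formatted line shown later in this thread, and the exact envelope, i.e. log level, timestamp, and nesting under baml_event, may differ.)

{
  "start_time": "2024-11-08T15:26:13.507Z",
  "num_tries": 1,
  "total_tries": 1,
  "client": "OpenAI",
  "model": "gpt-4o-mini",
  "latency_ms": 6177,
  "stop_reason": "stop",
  "prompt": "...",
  "llm_reply": "...",
  "request_options_json": "{\"max_tokens\":4096}",
  "tokens": { "prompt_tokens": 1055, "completion_tokens": 743, "total_tokens": 1798 },
  "parsed_response_type": "...",
  "parsed_response": "...",
  "error": null
}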
some notes:
@aaronvg Thanks for such a fast turnaround on this topic. I'll update our code to use the latest version and test it out. Cheers.
@aaronvg This is what the JSON output looks like. It looks like the JSON object is getting serialized as part of a plain-text log output.
I have set two env variables like this in the docker-compose file:
BAML_LOG: "INFO"
BAML_LOG_JSON: 1
FROM node:22
WORKDIR /app
COPY baml_src/ baml_src/
# If you want to pin to a specific version (which we recommend):
# RUN npm install -g @boundaryml/baml@VERSION
RUN npm install -g @boundaryml/baml@0.66.0
USER node
CMD ["baml-cli", "serve", "--preview", "--port", "2024"]
2024-11-08 15:25:28 [2024-11-08T15:25:28Z WARN baml_runtime::cli::serve] BAML-over-HTTP is a preview feature.
2024-11-08 15:25:28
2024-11-08 15:25:28 Please provide feedback and let us know if you run into any issues:
2024-11-08 15:25:28
2024-11-08 15:25:28 - join our Discord at https://docs.boundaryml.com/discord, or
2024-11-08 15:25:28 - comment on https://github.com/BoundaryML/baml/issues/892
2024-11-08 15:25:28
2024-11-08 15:25:28 We expect to stabilize this feature over the next few weeks, but we need
2024-11-08 15:25:28 your feedback to do so.
2024-11-08 15:25:28
2024-11-08 15:25:28 Thanks for trying out BAML!
2024-11-08 15:25:28
2024-11-08 15:25:28 [2024-11-08T15:25:28Z INFO baml_runtime::cli::serve] BAML-over-HTTP listening on port 2024, serving from ./baml_src
2024-11-08 15:25:28
2024-11-08 15:25:28 Tip: test that the server is up using `curl http://localhost:2024/_debug/ping`
2024-11-08 15:25:28
2024-11-08 15:25:28 (You may need to replace "localhost" with the container hostname as appropriate.)
2024-11-08 15:25:28
2024-11-08 15:26:13 [2024-11-08T15:26:13Z WARN baml_runtime::cli::serve] BAML_PASSWORD not set, skipping auth check
2024-11-08 15:26:19 [2024-11-08T15:26:19Z INFO baml_events] baml_event=BamlEventJson { start_time: "2024-11-08T15:26:13.507Z", num_tries: 1, total_tries: 1, client: "OpenAI", model: "gpt-4o-mini", latency_ms: 6177, stop_reason: "stop", prompt: "......", llm_reply: ".....", request_options_json: "{\"max_tokens\":4096}", tokens: TokenUsage { prompt_tokens: 1055, completion_tokens: 743, total_tokens: 1798 }, parsed_response_type: ".....", parsed_response: ".....", error: () }
ah I missed one configuration on our end for the release -- let me re-test and release a patch. Will also be released by Monday
I just released another patch in 0.68.0 to fix this
Thanks - yes the latest version works great and the logs are displaying correctly in Kibana.
Please leave any and all feedback about your experience using baml via our OpenAPI generators on this issue!
We're actively working on this and expect that we will have to do work to address pain points that users run into (including if, say, you're having trouble installing npx and would rather we provide a universal installer). We'll be prioritizing work based on whether or not someone runs into it, and will update this issue when we do!
Re stability: we're actively working on the OpenAPI generator and may need to make backwards-breaking changes to stabilize the API.
Open questions:
- How should we represent optional fields as nullable? (Currently we don't use 3.0.x's nullable, nor do we use 3.1.x's oneOf: [type, 'null'].)

If we have to make a backwards-breaking change to stabilize our OpenAPI support, we'll be sure to update this issue and work with any affected users to make sure that their use case will be supported moving forward.