PygmalionAI / aphrodite-engine

PygmalionAI's large-scale inference engine
https://pygmalion.chat
GNU Affero General Public License v3.0

[Feature]: Provide configuration via env vars or a configuration file #425

Open alexandreteles opened 3 weeks ago

alexandreteles commented 3 weeks ago

🚀 The feature, motivation and pitch

I realized I made a simple mistake after spending 30 minutes trying to figure out why the DTYPE environment variable wasn't working in Docker. The command line option is --dtype, so I thought I was doing something wrong. However, it turns out the correct environment variable name is DATATYPE in Docker, and I learned this the hard way by not checking the entrypoint.sh file earlier.

What is the best solution then? Ideally, each command line option would have a corresponding environment variable with a consistent name. Since users can provide conflicting values through environment variables and command line options, the command line options should take priority. Even better, we could manage all settings in a configuration file, which could be loaded either through a --config-file command line option or a CONFIG_FILE environment variable that specifies the file path (an underscore rather than a hyphen, since shells do not accept hyphens in variable names).
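To make the precedence idea concrete, here is a minimal sketch using Python's standard argparse module. It is not how aphrodite-engine currently parses its options. The naming rule (strip the leading dashes, replace hyphens with underscores, uppercase) and the helper names env_name, add_option and build_parser are assumptions made for illustration.

```python
import argparse
import os


def env_name(flag: str) -> str:
    """Map a flag such as '--tensor-parallel-size' to 'TENSOR_PARALLEL_SIZE'.

    The naming rule is an assumption for this sketch, not an existing convention.
    """
    return flag.lstrip("-").replace("-", "_").upper()


def add_option(parser, flag, *, environ, default=None, **kwargs):
    """Register a flag whose default comes from the matching env var, if set."""
    raw = environ.get(env_name(flag))
    if raw is not None:
        # argparse applies `type=` to string defaults, so "8" still becomes 8.
        default = raw
    parser.add_argument(flag, default=default, **kwargs)


def build_parser(environ=None):
    """Build a parser with two of the options mentioned in this issue."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    add_option(parser, "--dtype", environ=environ, default="auto")
    add_option(parser, "--tensor-parallel-size", environ=environ,
               default=1, type=int)
    return parser


if __name__ == "__main__":
    # An explicit --dtype on the command line overrides DTYPE from the environment.
    print(build_parser().parse_args())
```

Because the environment only supplies the default, a flag given on the command line overrides it without any extra merging logic.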

Below is a proposed format for the configuration file:

config:
    model: model_name_or_path
    tokenizer-mode: auto
    trust-remote-code: true
    download-dir: /path/to/large/volume
    dtype: auto
    tensor-parallel-size: 8
    kv-cache-dtype: fp8_e5m2
    chat-template: /path/to/my/template
    sampler:
        temperature: 1
        min_p: 0.1
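
A file in this format could be fed into an argparse-based parser roughly as follows. This is only a sketch: it assumes PyYAML is installed, and the helper names normalize_keys, load_config and parse_with_config are hypothetical, not part of the engine. File values are applied as parser defaults, so flags given on the command line still take priority.

```python
import argparse


def normalize_keys(node):
    """Recursively rename 'tensor-parallel-size' style keys to 'tensor_parallel_size'."""
    if isinstance(node, dict):
        return {str(k).replace("-", "_"): normalize_keys(v) for k, v in node.items()}
    return node


def load_config(path):
    """Read the YAML file and return the normalized top-level 'config' mapping."""
    import yaml  # PyYAML (third-party), imported lazily so the rest works without it

    with open(path, encoding="utf-8") as handle:
        document = yaml.safe_load(handle) or {}
    return normalize_keys(document.get("config", {}))


def parse_with_config(parser, argv, file_values):
    """Apply file values as defaults so explicit command line flags still win."""
    parser.set_defaults(**file_values)
    return parser.parse_args(argv)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--dtype", default="auto")
    parser.add_argument("--tensor-parallel-size", type=int, default=1)
    # Stand-in for load_config("/path/to/config.yaml")
    values = normalize_keys({"dtype": "auto", "tensor-parallel-size": 8})
    print(parse_with_config(parser, ["--dtype", "float16"], values))
```

Nested sections such as sampler are kept as nested dictionaries in this sketch; how they should map onto individual options is left open.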

Thank you!

Alternatives

No response

Additional context

No response