Triton Configuration
Makes Triton setup configurable with a YAML config. Currently it is WYSIWYG (what you see is what you get). If you do not provide a config, the default setup is used. If you do provide one, either by placing config.yaml in ./src/ or by setting the environment variable COG_TRITON_CONFIG=./src/myconfig.yaml, it is ingested and used to generate the Triton run-time configs.
Note: there are no protections right now. We should add validation and protections eventually.
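The lookup described above could be sketched roughly as follows. This is an illustration, not the code in this PR: the function name resolve_triton_config_path is hypothetical, and the precedence (the COG_TRITON_CONFIG variable wins over ./src/config.yaml) is an assumption, since the description does not say which source takes priority when both are present.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default location named in the description above.
DEFAULT_CONFIG_PATH = Path("./src/config.yaml")


def resolve_triton_config_path(
    env: Mapping[str, str] = os.environ,
    default_path: Path = DEFAULT_CONFIG_PATH,
) -> Optional[Path]:
    """Return the config path to ingest, or None to use the default setup.

    Hypothetical helper. ASSUMPTION: the env var takes precedence over the
    file in ./src/; the PR description does not state the actual order.
    """
    override = env.get("COG_TRITON_CONFIG")
    if override:
        # No validation here, matching the "no protections" note above.
        return Path(override)
    if default_path.is_file():
        return default_path
    # No config supplied: the caller falls back to the default setup.
    return None
```

A caller would treat a None result as "use the default setup" and otherwise load the returned file as YAML.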
Random Seeds
We now expose seed in the predict signature. If it is not specified, we sample a seed and use that. We also log the seed, along with a note that it has no effect on generation when greedy decoding is used.
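A minimal sketch of the seed handling described above, under stated assumptions: the helper name resolve_seed, the 32-bit sampling range, and the exact log wording are all hypothetical and are not taken from this PR.

```python
import random
from typing import Optional, Tuple


def resolve_seed(seed: Optional[int] = None) -> Tuple[int, str]:
    """Return the seed to use and the message to log.

    Hypothetical helper. ASSUMPTION: seeds are sampled from [0, 2**32 - 1];
    the PR description does not specify the range.
    """
    if seed is None:
        # Caller did not pass a seed, so sample one.
        seed = random.randint(0, 2**32 - 1)
    message = (
        f"Using seed: {seed} "
        "(note: the seed has no effect when greedy decoding is used)"
    )
    return seed, message
```

In a predict function, the returned message would be logged and the seed forwarded to the generation request.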
Make pad and end ID configurable
The pad and end IDs were hardcoded to 2 for Llama; they can now be configured through an environment variable.
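One way such an override could look is sketched below. The variable names PAD_ID and END_ID and the helper read_token_id are hypothetical, since the description does not name the actual environment variable; only the fallback value of 2 comes from the text above.

```python
import os
from typing import Mapping

# Previous hardcoded value for Llama, per the description above.
DEFAULT_TOKEN_ID = 2


def read_token_id(
    name: str,
    env: Mapping[str, str] = os.environ,
    default: int = DEFAULT_TOKEN_ID,
) -> int:
    """Read a token ID from the environment, falling back to the default.

    Hypothetical helper; the env var names passed in are illustrative only.
    """
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    # int() raises ValueError on non-numeric input; no extra validation.
    return int(raw)


# Hypothetical variable names, not taken from the PR.
pad_id = read_token_id("PAD_ID")
end_id = read_token_id("END_ID")
```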