explainers-by-googlers / prompt-api

A proposal for a web API for prompting browser-provided language models
Creative Commons Attribution 4.0 International

Sampling hyperparameters are not universal among models #42

Open domenic opened 1 day ago

domenic commented 1 day ago

The explainer currently assumes that a model is best controlled by setting its temperature and top-K sampling hyperparameters.

However, these aren't universal: not all models support them, and various models expose more. Others to consider are top-P, max tokens (#36), repetition penalty, presence penalty, frequency penalty, and more.

This poses a challenge for creating an interoperable API for which each browser can bring their own model.

One path here is to pick a set of hyperparameters and require that every implementation allow control over them. (Possibly including no real control, e.g. a frequency penalty with max = min = 1.)
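To make the "degenerate range" idea concrete, here's a rough sketch. The capabilities shape and the `resolveParam` helper below are hypothetical, not part of the current explainer; the point is that a site can clamp requested values into whatever range a model advertises, and a model with no real control over some parameter can advertise max = min.

```javascript
// Hypothetical per-parameter ranges a browser might report. A model
// without real control over a parameter reports a degenerate range.
const capabilities = {
  temperature: { min: 0.0, max: 2.0, default: 1.0 },
  topK: { min: 1, max: 8, default: 3 },
  frequencyPenalty: { min: 1, max: 1, default: 1 }, // max = min: no control
};

// Clamp a requested value into the advertised range, falling back to the
// default when the parameter is unknown or the value is unspecified.
function resolveParam(name, requested) {
  const range = capabilities[name];
  if (range === undefined || requested === undefined) {
    return range?.default;
  }
  return Math.min(range.max, Math.max(range.min, requested));
}

console.log(resolveParam("topK", 40)); // → 8 (clamped to advertised max)
console.log(resolveParam("frequencyPenalty", 2)); // → 1 (degenerate range)
```

Whether clamping, throwing, or silently ignoring out-of-range values is the right behavior is itself an interop question this issue would need to settle.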

It would be especially helpful if others interested in implementing the prompt API were able to chime in with their implementation constraints.

tomayac commented 1 day ago

+1. It's worth noting that not all models allow top-K to be specified, either. For example, OpenAI's Chat API only lets you modify top-P (they call it top_p), but not top-K.

Similar to #41, should new parameters be added to the interface, we'd want to make sure that the min-* and max-* values (where applicable) are queryable, since these differ between models.
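One consequence of queryable ranges, sketched below under the same hypothetical capabilities shape as above: a site (or devtools UI) could hide controls for parameters whose advertised range is degenerate, i.e. not actually tunable on this model.

```javascript
// Hypothetical capabilities object where each parameter advertises
// { min, max, default }; none of these names are in the explainer yet.
const caps = {
  topP: { min: 0, max: 1, default: 1 },
  topK: { min: 1, max: 1, default: 1 }, // degenerate: not really tunable
};

// A parameter is only worth surfacing to users if its range is non-degenerate.
function tunableParams(caps) {
  return Object.keys(caps).filter((name) => caps[name].max > caps[name].min);
}

console.log(tunableParams(caps)); // → ["topP"]
```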