Embeddings: Expose the prompt templates found in config_sentence_transformers.json, when available

Feature request

Many embedding models are trained with task-specific prefixes, and these prefixes are often described in a file named config_sentence_transformers.json (example: see the 'prompts' object). It would be nice if the content of that object were exposed in some way.

Motivation

Without this, I can't choose an embedding model at runtime and still get the performance I need from it, because I have to look up each model's task-specific prefixes out of band. That limits the set of models my code can run against.

I'm currently working on a simple text classifier that uses sentence embedding models, but I've also encountered this while working on a RAG pipeline. I'd like to be able to initialize a feature-extraction pipeline and inspect it to determine whether I need to add a task-specific prefix to the text that I'm about to embed with the pipeline.
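
To make the workflow above concrete, here is a minimal sketch in plain JavaScript. It assumes the parsed contents of config_sentence_transformers.json are available as an object with a 'prompts' map and an optional 'default_prompt_name', which is the layout Sentence Transformers writes. The resolvePrompt helper and the sample prompt strings are hypothetical, and none of this is an existing transformers.js API.

```javascript
// Illustrative stand-in for a parsed config_sentence_transformers.json.
// The prompt strings are made up for this example, not taken from a specific model.
const sentenceTransformersConfig = {
  prompts: { query: "query: ", document: "passage: " },
  default_prompt_name: null,
};

// Hypothetical helper: returns the prefix for a task, or "" when none applies.
function resolvePrompt(config, promptName) {
  const prompts = (config && config.prompts) || {};
  // Fall back to the model's default prompt when the caller names no task.
  const name = promptName ?? (config && config.default_prompt_name) ?? null;
  if (name === null) return "";
  if (!Object.prototype.hasOwnProperty.call(prompts, name)) {
    const available = Object.keys(prompts).join(", ");
    throw new Error(`Unknown prompt name "${name}"; available: ${available}`);
  }
  return prompts[name];
}

// Prefix the text before handing it to the feature-extraction pipeline.
const text = "How do I reset my password?";
const prefixed = resolvePrompt(sentenceTransformersConfig, "query") + text;
console.log(prefixed);
```

With something like this, my code could stay model-agnostic: a model that ships no prompts yields an empty prefix, and a model that does ship them gets the right one without any out-of-band lookup.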

Alternatively, I'd like it if the task-specific prefix were applied automatically by the pipeline itself. I imagine, though, that this would require the user to identify the task to the pipeline on creation, which may be a more complicated change than allowing me to inspect the config.
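
As a rough illustration of this alternative, the sketch below binds a task to an embedding function once, at creation time, and prepends the matching prefix on every call. The withTaskPrompt wrapper, the fakeEmbed stand-in, and the prompt string are all hypothetical. In practice the wrapped function would call a real feature-extraction pipeline, and the config would come from the model's config_sentence_transformers.json.

```javascript
// Hypothetical wrapper: `embed` is any function mapping an array of strings
// to a promise of vectors; `config` is the parsed config_sentence_transformers.json.
function withTaskPrompt(embed, config, task) {
  const prompts = (config && config.prompts) || {};
  if (!Object.prototype.hasOwnProperty.call(prompts, task)) {
    throw new Error(`Model defines no prompt for task "${task}"`);
  }
  const prefix = prompts[task];
  // The task is fixed here, so callers never handle the prefix themselves.
  return (texts, options) => embed(texts.map((t) => prefix + t), options);
}

// Stand-in embedder so the sketch runs without downloading a model:
// it "embeds" each text as a one-element vector holding its length.
const fakeEmbed = async (texts) => texts.map((t) => [t.length]);

const embedQuery = withTaskPrompt(fakeEmbed, { prompts: { query: "query: " } }, "query");
embedQuery(["hello"]).then((vectors) => console.log(vectors));
```

The trade-off is the one described above: the caller has to name the task up front, and a model that defines no prompt for that task fails at creation rather than silently embedding unprefixed text.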

Your contribution

If the maintainers can provide some guidance on how this feature should be implemented, I'd be happy to submit a PR.

xenova / transformers.js