Closed Wauplin closed 2 months ago
@tomaarsen (or @osanseviero since I know you're working on feature-extraction lately) could I get a review on this PR please? :pray:
The import script is not so important to review. Better to focus on `./inference.ts`, `./specs/input.json` and `./specs/output.json` to check the feature-extraction parameters.
Thanks for the reviews! Most comments are about TEI (which was expected^^). I addressed/replied where I could. Are there any blockers before merging this?
This PR adds a script to import `feature-extraction` inference types from text-embeddings-inference. The jsonschema is pulled from https://huggingface.github.io/text-embeddings-inference/openapi.json and converted into the JSONSchema format from which we generate types for the JS and Python clients. This script is heavily inspired by the TGI importer script.

This PR also adds the `prompt_name` input parameter that has been newly added to TEI (see https://github.com/huggingface/text-embeddings-inference/pull/312).

Decisions taken:
- `string` as input. In theory TEI is capable of handling much more complex inputs (`Union[List[Union[List[int], int, str]], str]`) but let's keep it simple for now. Other inference tasks are also currently defined without arrays even when InferenceAPI/Endpoints is capable of it.
- `/embed` route, which is the closest one to the `feature-extraction` task.

Note: in a follow-up PR it would be really nice to put this in a CI workflow that could be triggered manually to open a PR when new arguments are added to TGI / TEI.
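
For readers unfamiliar with the conversion step, here is a minimal sketch of how a request schema for one route could be pulled out of an OpenAPI document and turned into a standalone JSON schema by inlining local `$ref` pointers. This is not the importer script from this PR (which is not shown in this thread). The helper names `inlineRefs` and `extractRequestSchema` are hypothetical, and `exampleDoc` is a tiny hand-written stand-in for the real `openapi.json`; its schema names and fields are illustrative assumptions only.

```typescript
// Loosely typed JSON object; only the fields used below are modelled.
type JsonObject = { [key: string]: unknown };

// Minimal subset of an OpenAPI document (assumption: only POST + JSON bodies matter here).
interface OpenApiDoc {
  paths: Record<string, { post?: { requestBody?: { content?: Record<string, { schema?: JsonObject }> } } }>;
  components?: { schemas?: Record<string, JsonObject> };
}

// Recursively replace local "#/components/schemas/<Name>" references with the
// referenced schema, so the result no longer depends on the OpenAPI document.
function inlineRefs(node: unknown, doc: OpenApiDoc, seen: string[] = []): unknown {
  if (Array.isArray(node)) return node.map((item) => inlineRefs(item, doc, seen));
  if (node === null || typeof node !== "object") return node;
  const obj = node as JsonObject;
  const ref = obj["$ref"];
  if (typeof ref === "string") {
    const prefix = "#/components/schemas/";
    if (!ref.startsWith(prefix)) throw new Error(`Unsupported $ref: ${ref}`);
    const name = ref.slice(prefix.length);
    // Guard against self-referencing schemas, which cannot be inlined.
    if (seen.includes(name)) throw new Error(`Circular $ref: ${name}`);
    const target = doc.components?.schemas?.[name];
    if (target === undefined) throw new Error(`Unknown schema: ${name}`);
    return inlineRefs(target, doc, [...seen, name]);
  }
  const out: JsonObject = {};
  for (const [key, value] of Object.entries(obj)) out[key] = inlineRefs(value, doc, seen);
  return out;
}

// Return the JSON request-body schema of `POST <route>` with all references inlined.
function extractRequestSchema(doc: OpenApiDoc, route: string): JsonObject {
  const schema = doc.paths[route]?.post?.requestBody?.content?.["application/json"]?.schema;
  if (schema === undefined) throw new Error(`No JSON request body for POST ${route}`);
  return inlineRefs(schema, doc) as JsonObject;
}

// Hand-written stand-in document (NOT the real TEI spec), just to exercise the helpers.
const exampleDoc: OpenApiDoc = {
  paths: {
    "/embed": {
      post: {
        requestBody: {
          content: { "application/json": { schema: { $ref: "#/components/schemas/EmbedRequest" } } },
        },
      },
    },
  },
  components: {
    schemas: {
      EmbedRequest: {
        type: "object",
        required: ["inputs"],
        properties: {
          inputs: { $ref: "#/components/schemas/Input" },
          prompt_name: { type: "string" },
        },
      },
      Input: { type: "string" },
    },
  },
};

const embedSchema = extractRequestSchema(exampleDoc, "/embed");
console.log(JSON.stringify(embedSchema, null, 2));
```

In a real importer the document would first be fetched from the URL above, and the response schema would need the same treatment. Keeping the extraction as a pure function over an already-parsed document makes it easy to test without network access, which would also suit the manually triggered CI workflow suggested in the note.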