Custom embedder with custom config definition

vespa-engine / vespa

AI + Data, online. https://vespa.ai

Apache License 2.0

5.68k stars, 592 forks

Describe the bug: I am trying to set up a custom embedder that closely resembles BertBaseEmbedder. When I read the path of the tokenizer file from the config, it does not return the absolute path to the file but the path to the directory that contains it.

To Reproduce: The same happens in the BertBaseEmbedder class, so the problem can be reproduced by setting up an example that uses paths to refer to the vocabulary and the model.

Expected behavior: The config should return the correct absolute path to the file.
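To make the symptom easier to confirm, here is a minimal sketch of a check that reports whether a configured path points to a regular file or to a directory. It uses only the Java standard library and is not part of Vespa: the class name `ConfigPathCheck` and the methods `describe` and `requireRegularFile` are hypothetical, and how the path is obtained from the generated config class is left out on purpose.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ConfigPathCheck {

    /** Hypothetical helper: reports what kind of filesystem entry a configured path points to. */
    static String describe(Path configured) {
        Path absolute = configured.toAbsolutePath().normalize();
        if (Files.isRegularFile(absolute)) return "file: " + absolute;
        if (Files.isDirectory(absolute)) return "directory: " + absolute;
        return "missing: " + absolute;
    }

    /** Hypothetical helper: fails fast when the configured path is not a regular file. */
    static Path requireRegularFile(Path configured) {
        Path absolute = configured.toAbsolutePath().normalize();
        if (!Files.isRegularFile(absolute)) {
            throw new IllegalArgumentException("Expected a file but got " + describe(configured));
        }
        return absolute;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the paths the embedder would receive from its config.
        Path dir = Files.createTempDirectory("embedder-files");
        Path vocab = Files.createFile(dir.resolve("vocab.txt"));

        System.out.println(describe(vocab)); // a regular file, as expected
        System.out.println(describe(dir));   // a directory, which is the reported symptom
    }
}
```

Calling something like `requireRegularFile` in the embedder constructor, before the tokenizer is loaded, would turn the problem into an explicit error message that names the directory that was received.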

Screenshots:

The error I describe: [Screenshot 2022-08-16 at 10 19 12]
services.xml: [Screenshot 2022-08-16 at 10 19 24]
sentence-embedder.def: [Screenshot 2022-08-16 at 10 30 37]

Environment:

OS: macOS M1
Infrastructure: self-hosted
Version: 12.5
Vespa version: 8.31.22

Custom embedder with custom config definition #23675