Can other embeddings be used instead of text-embedding-ada-002?

cosmocode / dokuwiki-plugin-aichat

Chat with a LLM about your DokuWiki contents

https://www.dokuwiki.org/plugin:aichat

GNU General Public License v2.0

Can other embeddings be used instead of text-embedding-ada-002? #13

Closed. macinrdw closed this issue 5 months ago

macinrdw commented 8 months ago

Feature Description

I noticed that text-embedding-ada-002 is used by default to calculate embeddings. I tried changing the GPT35Turbo.php file to use text-embedding-3-small, but then the search does not seem to find any similar pages.

splitbrain commented 8 months ago

I am currently working on a new release which makes it easier to switch between the different models. See the refactor branch.

So far I have only run a short test with text-embedding-3-small. The problem seems to be that the vector distances between short questions and the document chunks are much larger than with the ada model (or the other embedding models I tested). I haven't investigated why yet.
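
To illustrate how larger distances could lead to an empty search result, here is a minimal sketch. It assumes the retrieval step keeps only chunks whose cosine similarity to the question reaches a fixed cutoff; that is an assumption for illustration, not something confirmed from the plugin's code. The vectors, the cutoff values, and the names `cosine_similarity` and `search` are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, chunks, threshold):
    """Return (name, score) pairs whose similarity meets the threshold, best first.

    Hypothetical helper: the fixed threshold is an assumption about how
    a retrieval step might filter results.
    """
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in chunks.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
query = [1.0, 0.0, 0.0]

# A model whose chunk vectors land close to the question vector.
chunks_close = {"page_a": [0.9, 0.3, 0.1], "page_b": [0.8, 0.5, 0.2]}

# A model whose chunk vectors land further away from the same question.
chunks_far = {"page_a": [0.5, 0.7, 0.5], "page_b": [0.4, 0.8, 0.4]}

# A cutoff tuned for the first model finds both pages ...
print(search(query, chunks_close, threshold=0.75))

# ... but the same cutoff filters out everything for the second model.
print(search(query, chunks_far, threshold=0.75))

# Lowering the cutoff brings the pages back, in the same order.
print(search(query, chunks_far, threshold=0.4))
```

If a cutoff like this is in play, the ranking of the pages can stay the same while every score drops below it, so the cutoff would need to be re-tuned per embedding model.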

macinrdw commented 8 months ago

Cool! Which other, newer embeddings worked for you, @splitbrain? I just wanted to move away from ada, as it seems to be an old model. I tried text-embedding-3-large too, but saw a similar effect ;/