edubruell / tidyllm

A tidy interface to large language model APIs for R
https://edubruell.github.io/tidyllm/

Article on using Embeddings #29

Closed edubruell closed 1 week ago

edubruell commented 2 weeks ago

Add an article introducing embeddings and demonstrating how to retrieve similar abstracts from a list of embedded economics paper abstracts using tidyllm.

Proposed Content:

  1. Intro to Embeddings:

    • Define embeddings and their purpose in capturing semantic relationships.
    • Mention common use cases such as similarity search, clustering, and outlier detection in document corpora.
  2. Practical Example:

    • Embed economics paper abstracts using ollama_embedding() or the other embedding API functions.
    • Calculate cosine similarity to find abstracts similar to a target abstract.
    • Provide code to rank and retrieve the most similar abstracts.
  3. Optional:

    • Visualize embeddings with t-SNE.
    • Discuss similarity score interpretation.
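
A rough sketch of the ranking step from the practical example, in base R only. The embedding matrix here is made-up toy data standing in for whatever ollama_embedding() or another embedding function would return, and the names `embeddings`, `cosine_to_target`, `target_id`, and the abstract labels are hypothetical placeholders, not part of tidyllm.

```r
# Toy embeddings: one row per abstract, one column per embedding dimension.
# These numbers are invented for illustration; real vectors would come from
# an embedding model and have hundreds of dimensions.
embeddings <- rbind(
  minimum_wage    = c(0.9, 0.1, 0.0),
  labor_supply    = c(0.8, 0.3, 0.1),
  monetary_policy = c(0.1, 0.9, 0.2),
  trade_tariffs   = c(0.0, 0.2, 0.9)
)

# Cosine similarity between a target vector and every row of a matrix:
# dot product divided by the product of the two vector norms.
cosine_to_target <- function(mat, target) {
  dots  <- as.vector(mat %*% target)
  norms <- sqrt(rowSums(mat^2)) * sqrt(sum(target^2))
  setNames(dots / norms, rownames(mat))
}

# Pick a target abstract and score all abstracts against it.
target_id <- "minimum_wage"
sims <- cosine_to_target(embeddings, embeddings[target_id, ])

# Drop the target itself, then rank the rest from most to least similar.
ranked <- sort(sims[names(sims) != target_id], decreasing = TRUE)
top_k  <- head(ranked, 2)
print(top_k)
```

In the article, the toy matrix would be replaced by the real embedding output, and the ranked names could be joined back to the abstract text for display. Cosine similarity ignores vector length, so it compares direction only, which is worth mentioning when discussing how to interpret the scores.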