jxmorris12 / vec2text

utilities for decoding deep representations (like sentence embeddings) back to text

What's happening in the example? #33

Open startakovsky opened 6 months ago

startakovsky commented 6 months ago

I didn't find the example that clear, but I have a guess at what's happening. It might be worth spelling out that the goal is to map text to an embedding and then back to (roughly) the original text, or perhaps to a smattering of points in the pre-image of the embedding; I'm not sure which, because it isn't clear from what's written.

My two cents.

jxmorris12 commented 6 months ago

@startakovsky can you be more specific? Which example, and what did you find confusing?

startakovsky commented 6 months ago

Looks like the example takes in text and outputs text. What's happening?

vec2text has a function invert_strings which takes a list of strings and produces a list of strings.

The name of the function was confusing to me.

In my mind it's a misnomer if what is actually happening is:

  1. Input List of strings
  2. Produce embeddings associated to those strings
  3. Then run invert_embeddings under the hood
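Under that reading, `invert_strings` is just a composition of an embedder with `invert_embeddings`. Here is a toy sketch of steps 1–3 (the character-code "embedding" is purely illustrative; vec2text uses real sentence-embedding models and a learned inversion model, so its round trip is approximate, not exact):

```python
# Toy sketch of the three-step pipeline described above.
# NOT the real vec2text implementation; names and logic are illustrative.

def embed(strings):
    """Step 2: map each string to a vector (here, just its character codes)."""
    return [[ord(c) for c in s] for s in strings]

def invert_embeddings(embeddings):
    """Step 3: recover a string from each vector (exact in this toy,
    approximate in the real system)."""
    return ["".join(chr(x) for x in e) for e in embeddings]

def invert_strings(strings):
    """Steps 1 -> 2 -> 3: a list of strings in, a list of strings out,
    with the embeddings produced only under the hood."""
    return invert_embeddings(embed(strings))

print(invert_strings(["hello world"]))  # round-trips exactly in this toy
```

The point of the sketch is the naming question: the function a user calls takes strings and returns strings, while the actual inversion happens on the hidden embeddings.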

Maybe this is because this whole thing seems to be about:

$\mathcal{E}(\text{strings}) = \text{embeddings}$

$\mathcal{E}^{-1}(\text{embeddings}) = \text{strings}$

And so maybe what would be helpful is thinking about this like:

The goal of invert_strings is to find similar strings. The way we do that is to embed each input, then run our algorithm to find the inverse of each embedding, landing on a semantically similar list of strings.

The goal of invert_embeddings is to find strings that, when embedded, produce the given embeddings.
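One way to make that goal concrete is to treat inversion as a search for the string whose embedding is closest to the target vector. A minimal sketch follows; the candidate-list search and bag-of-characters embedding are illustrative assumptions, not vec2text's actual iterative correction algorithm:

```python
import math

def embed(s):
    """Toy bag-of-characters embedding over a-z; illustrative only."""
    vec = [0.0] * 26
    for c in s.lower():
        if c.isalpha():
            vec[ord(c) - ord("a")] += 1.0
    return vec

def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def invert_embedding(target, candidates):
    """Return the candidate string whose embedding is nearest the target:
    a string that, when embedded, (approximately) produces the embedding."""
    return min(candidates, key=lambda s: distance(embed(s), target))

candidates = ["the cat sat", "dogs bark loudly", "hello there"]
target = embed("the cat sat")
print(invert_embedding(target, candidates))  # -> "the cat sat"
```

In the real system there is no fixed candidate list: the space of strings is searched with a trained model that proposes text and iteratively corrects it toward the target embedding, but the objective is the same nearest-embedding idea.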

startakovsky commented 5 months ago

@jxmorris12 does that help?