Add a helper function to display vectors of logits nicely

neelnanda-io commented 1 year ago

Often you want to look at vectors over the vocabulary (eg the logits at a specific position). This is >50,000 dimensions and this is hard to interpret! I want there to be nice utils to visualize a vector like this.

An MVP would be a function mapping this to a pandas dataframe, with the token index, token string value, logit, log prob and probability. Either for just the top K, or for the entire vocab.
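A minimal sketch of what such an MVP could look like, in plain Python so the arithmetic is visible. The names `logit_rows`, `logit_dataframe` and `id_to_token` are hypothetical, not existing TransformerLens functions, and `id_to_token` is assumed to be any mapping (or list) from token id to token string.

```python
import math


def logit_rows(logits, id_to_token, top_k=None):
    """Return one dict per token, sorted by descending logit.

    logits: sequence of floats, one per vocabulary entry.
    id_to_token: hypothetical mapping (dict or list) from token id to string.
    top_k: keep only the k highest-logit tokens; None keeps the whole vocab.
    """
    # Numerically stable log-sum-exp: subtract the max before exponentiating.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))

    # Token ids ordered from highest to lowest logit.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]

    return [
        {
            "token_id": i,
            "token": id_to_token[i],
            "logit": logits[i],
            "log_prob": logits[i] - log_norm,
            "prob": math.exp(logits[i] - log_norm),
        }
        for i in order
    ]


def logit_dataframe(logits, id_to_token, top_k=None):
    """Wrap logit_rows in a pandas DataFrame (pandas is imported lazily)."""
    import pandas as pd

    return pd.DataFrame(logit_rows(logits, id_to_token, top_k))
```

With a PyTorch logits vector, `logits.tolist()` gives the list this sketch expects. For a full 50k vocab a vectorised version would be faster, but the columns would be the same.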

But I expect there are many ways to make something nice here! One option is to imitate nostalgebraist's graphing style for `plot_logit_lens` in `transformer_utils`. This takes a layer x position x d_vocab tensor and visualises it as a layer x position heatmap, printing the string value of the top token in each cell and colouring each cell by the top token's value.
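A rough sketch of the reduction such a heatmap needs, assuming the tensor is given as a nested list indexed `[layer][position][vocab]`. This is not nostalgebraist's implementation; `top_token_grid`, `plot_top_token_grid` and `id_to_token` are hypothetical names, and the plotting half assumes matplotlib is installed.

```python
def top_token_grid(logits, id_to_token):
    """Reduce a [layer][position][d_vocab] nested list to two 2D grids.

    Returns (labels, values): the top token's string and its logit
    for every (layer, position) cell.
    """
    labels, values = [], []
    for layer in logits:
        label_row, value_row = [], []
        for vocab_logits in layer:
            # Index of the highest logit at this (layer, position) cell.
            best = max(range(len(vocab_logits)), key=vocab_logits.__getitem__)
            label_row.append(id_to_token[best])
            value_row.append(vocab_logits[best])
        labels.append(label_row)
        values.append(value_row)
    return labels, values


def plot_top_token_grid(labels, values):
    """Draw the grids as an annotated heatmap (matplotlib imported lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    image = ax.imshow(values, aspect="auto")  # colour by top-token logit
    for row, label_row in enumerate(labels):
        for col, label in enumerate(label_row):
            ax.text(col, row, label, ha="center", va="center")
    ax.set_xlabel("position")
    ax.set_ylabel("layer")
    fig.colorbar(image, ax=ax, label="top-token logit")
    return fig
```

Colouring by probability instead of logit would only need a softmax over the vocab axis before taking the max.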

sheikheddy commented 1 year ago

I recommend http://circos.ca/intro/circular_approach/.

Python implementations: https://github.com/ponnhide/pyCircos or https://github.com/moshi4/pyCirclize

sheikheddy commented 1 year ago

Okay, I'm going to put down some rough thoughts:

Often you want to look at vectors over the vocabulary (eg the logits at a specific position). This is >50,000 dimensions and this is hard to interpret! I want there to be nice utils to visualize a vector like this.

A more explicit way to put it:

| Encoding name | OpenAI models |
| --- | --- |
| `gpt2` (or `r50k_base`) | Most GPT-3 models (and GPT-2) |
| `p50k_base` | Code models, `text-davinci-002`, `text-davinci-003` |
| `cl100k_base` | `text-embedding-ada-002` |

Let's start with this snippet from https://github.com/openai/tiktoken:

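# Import added so the snippet is self-contained; the helper lives in tiktoken/load.py.
from tiktoken.load import data_gym_to_mergeable_bpe_ranks


def gpt2():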
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
        vocab_bpe_file="https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/vocab.bpe",
        encoder_json_file="https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/encoder.json",
    )
    return {
        "name": "gpt2",
        "explicit_n_vocab": 50257,
        "pat_str": r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""",
        "mergeable_ranks": mergeable_ranks,
        "special_tokens": {"<|endoftext|>": 50256},
    }

For clarity, here are a few assumptions:

There is a static dictionary mapping token ids (int) to token values (string).
Tokens earlier in the sequence affect the likelihood of tokens later (attention).
We are interested in how simple local interactions affect complex global structure.
The context window holds up to a few thousand tokens (1024 for GPT-2, 4096 for some later models), and each position has 50k+ possible tokens.
We usually look at logits at the final layer, but in principle we can compute them from the residual stream at any layer.
You can normalize with softmax(logits) to get probabilities, or log_softmax(logits) to get logprobs.
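
For reference, the standard relation between a logit vector $z$, the probabilities $p$, and the log probs is:

$$
p_i = \operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad \log p_i = z_i - \log \sum_j e^{z_j}
$$

So softmax gives probabilities, and subtracting the log-sum-exp from each logit gives log probs.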

Here are a few ideas:

If you have a really long wire, or a really long strip of film, it takes up a lot of horizontal space, and you can only see a small slice at once.
In the same way that we wrap wires into coils and wind filmstrips onto spools, we can take our string of positions and bend it into a circle, like a paperclip.
There are two main types of loops. In the first, you touch the two ends together to make the outline of a circle. You can use the space inside to draw lines that show connections between different sections, and the space outside for hierarchy.
In the second, you wind it up really tightly, like a rope or DNA. For these, we can imagine marking each cell with a color according to its state (this makes me think of Turing machines for some reason).
You can also animate it to unwind at a variable rate, where the speed is controlled by uncertainty: faster when probability mass is concentrated, slower when it is more spread out (this makes me think of Fourier decompositions and spectrograms). Ideally this would autoplay, like https://gource.io/.
See https://ourworldindata.org/technology-long-run and https://socks-studio.com/2021/11/03/constructing-knowledge-through-geometry-ramon-llulls-figures-in-ars-magna-1305/ for inspiration.
Analogy to video editing: the timeline is your 1D position vector, and you layer different effects, masks, and footage together into one final rendered composite. For interpretability, you run this process in reverse: go back from the final video to the source.
Finally, interactivity would be nice to have. Bret Victor has written a lot about this from a user design point of view, e.g. http://worrydream.com/MagicInk/#reducing_interaction. To make the programming a bit easier, I'd recommend borrowing heavily from existing component libraries.
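
Two of the ideas above are concrete enough to sketch: bending the positions into a circle, and pacing an animation by uncertainty. This is only one possible reading. All function names are hypothetical, "uncertainty" is assumed to mean the Shannon entropy of each position's next-token distribution, and the default durations are arbitrary.

```python
import math


def circle_layout(n_positions, radius=1.0):
    """Place n positions evenly on a circle, clockwise from 12 o'clock."""
    points = []
    for k in range(n_positions):
        angle = 2 * math.pi * k / n_positions
        # x = sin, y = cos starts at the top and moves clockwise (y pointing up).
        points.append((radius * math.sin(angle), radius * math.cos(angle)))
    return points


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def dwell_times(prob_rows, min_seconds=0.1, max_seconds=1.0):
    """Map each position's distribution to a display duration.

    Low entropy (concentrated mass) gets a short dwell, so playback is fast;
    entropy near the uniform maximum, log(vocab size), gets a long dwell.
    """
    times = []
    for probs in prob_rows:
        max_entropy = math.log(len(probs))
        fraction = entropy(probs) / max_entropy if max_entropy > 0 else 0.0
        times.append(min_seconds + fraction * (max_seconds - min_seconds))
    return times
```

A renderer could then draw position `k` at `circle_layout(n)[k]` and hold each frame for the matching entry of `dwell_times`.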

All of this sounds a bit overkill for a helper function, but if fully realized, I think it'd be a really neat tool.

sheikheddy commented 1 year ago

I'll try to put a prototype up this weekend

neelnanda-io commented 1 year ago

Thanks! I'll admit that those takes were too in depth for me to really get my head around them, but it sounded interesting and I would love to see a prototype

On Thu, 2 Mar 2023, 14:14 sheikheddy, @.***> wrote:

I'll try to put a prototype up this weekend


sheikheddy commented 1 year ago

Still working on this, have some links in the meantime

https://observablehq.com/@bstaats/graph-visualization-introduction
https://observablehq.com/@observablehq/why-use-a-radial-data-visualization
https://observablehq.com/@kerryrodden/equal-area-radial-matrix-of-lgbt-rights
https://observablehq.com/@mbostock/polar-clock

sheikheddy commented 1 year ago

Seems like this would be a contribution to https://github.com/alan-cooney/CircuitsVis/blob/main/python/circuitsvis/logits.py, not TransformerLens?

neelnanda-io commented 1 year ago

Ah, yes, if you're imagining a real interactive visualisation, putting it in CircuitsVis seems more natural. It's set up to be easy to integrate Javascript code and Python.

On Tue, 7 Mar 2023 at 15:05, sheikheddy @.***> wrote:

Seems like this would be a contribution to https://github.com/alan-cooney/CircuitsVis/blob/main/python/circuitsvis/logits.py, not TransformerLens?


jbloomAus commented 1 year ago

@sheikheddy @neelnanda-io What's the plan here? Do we need an interactive visualization or will something else do?

abdurraheemali commented 1 year ago

For a non-interactive visualization, flame graphs do pretty well: https://www.brendangregg.com/blog/2017-02-06/flamegraphs-vs-treemaps-vs-sunburst.html

(I'm @sheikheddy from an alt-account)

TransformerLensOrg / TransformerLens

Add a helper function to display vectors of logits nicely #112