Closed by lastonehome 1 month ago
The docid is just the ID of a document in the dataset. It is not a model output; the documents are most likely human-written text from the internet.
The token ID is just the index of the activated token. For example, if the token ID were 0, it would mean that the first token had a positive activation.
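To make that concrete, here is a minimal sketch of how a table row could be read, assuming the token ID is a zero-based position within the document identified by the docid. The toy dataset, the example rows, and the `lookup` helper are all made up for illustration and are not taken from the project's code.

```python
# Hypothetical toy data: the real dataset and tokenizer are not shown in this thread.
# Each docid maps to the list of tokens in that (human-written) document.
dataset = {
    17: ["The", "cat", "sat", "on", "the", "mat"],
    42: ["Rain", "fell", "all", "night"],
}

# Hypothetical table rows of (docid, token ID, activation value).
rows = [(17, 0, 0.8), (42, 2, 1.3)]


def lookup(docid, token_id):
    """Return the token at position token_id in the document with this docid."""
    return dataset[docid][token_id]


for docid, token_id, activation in rows:
    # A token ID of 0 points at the first token of the document.
    print(docid, token_id, lookup(docid, token_id), activation)
```

Under this reading, neither column refers to a prompt or to generated text: the docid picks a document, and the token ID picks the position in it where the activation was positive.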
Hi,
Thanks for publishing this. Can I check that I understand the output in the tables, please? I’m assuming the docid relates to a specific output from the model for a given prompt, and that the token id relates to the prompt itself?
Apologies if these are simple questions. I’m interested in explainable AI more than the technical ins and outs.