voidism / DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
https://arxiv.org/abs/2309.03883

What tool do you use to get the token prediction of each layer of large language models for Figure 2? #12

Closed frankdarkluo closed 7 months ago

voidism commented 7 months ago

Hi @frankdarkluo

I simply used matplotlib to make the table for Figure 2!

frankdarkluo commented 7 months ago

Thanks for the reply! But I am not asking about the drawing. I am curious how you get the probability distribution from the intermediate (not output) layers. Thanks.

voidism commented 7 months ago

I just inserted some code into the transformers package (modeling_llama.py and generation/utils.py) to collect the predictions along the decoding steps. It makes the code ugly, but it works. I didn't use any special tools for that.
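For readers who don't want to patch transformers: a similar effect can be sketched by running the model with `output_hidden_states=True` and applying the final RMSNorm plus the LM head to each layer's hidden state ("early exit"). This is a minimal illustration of the idea, not the author's actual patched code; the tiny randomly initialized LLaMA config below is an assumption so the example runs offline — swap in a real checkpoint for meaningful distributions.

```python
# Sketch: read out a next-token distribution from every layer of a LLaMA-style
# model by applying the final norm + LM head to intermediate hidden states.
# The tiny random config is only for a self-contained demo.
import torch
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=128, hidden_size=32, intermediate_size=64,
    num_hidden_layers=4, num_attention_heads=4, num_key_value_heads=4,
)
model = LlamaForCausalLM(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 5))
with torch.no_grad():
    out = model(input_ids, output_hidden_states=True)

# out.hidden_states has num_hidden_layers + 1 entries; the first is the
# embedding output, and intermediate entries are pre-final-norm.
for i, h in enumerate(out.hidden_states[1:], start=1):
    logits = model.lm_head(model.model.norm(h))   # early-exit prediction head
    probs = torch.softmax(logits[0, -1], dim=-1)  # next-token distribution
    top = probs.argmax().item()
    print(f"layer {i:2d}: top token id {top}, p={probs[top]:.3f}")
```

To reproduce something like Figure 2, you would collect `probs` for the tokens of interest at each decoding step and plot the resulting layer-by-step grid with matplotlib.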