jalammar / ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
https://ecco.readthedocs.io

Identical activation matrices across inputs in batch generation loop #97

Open · miguel-kjh opened 1 year ago

miguel-kjh commented 1 year ago

I ran into a problem while using the following code to generate outputs for a batch of input texts:

from typing import List

import ecco
from tqdm import tqdm

def generate_batch(input_text: List[str], max_length: int, model):
    list_of_outputs = []
    # Load the model once and reuse it for every prompt
    lm = ecco.from_pretrained(
        model,
        activations=True,
        verbose=False,
    )
    for text in tqdm(input_text, desc="Generating batch"):
        output = lm.generate(text, max_length=max_length)
        list_of_outputs.append(output)
    return list_of_outputs
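
For reference, this is roughly how I call the function; the model name and prompts below are just placeholders for my actual inputs:

outputs = generate_batch(
    ["The capital of France is", "Large language models are"],  # placeholder prompts
    max_length=10,
    model="gpt2",  # placeholder model name
)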

The problem is that when I compare the activation matrices collected for different inputs, they turn out to be identical. However, if I move the model instantiation inside the loop, the issue goes away:

def generate_batch(input_text: List[str], max_length: int, model):
    list_of_outputs = []
    for text in tqdm(input_text, desc="Generating batch"):
        # Re-instantiate the model for every prompt
        lm = ecco.from_pretrained(
            model,
            activations=True,
            verbose=False,
        )
        output = lm.generate(text, max_length=max_length)
        list_of_outputs.append(output)
    return list_of_outputs
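
For completeness, here is a minimal sketch of the check I use to compare two outputs. I'm assuming the collected activations are exposed through each output's activations attribute as a NumPy-convertible array with token positions on the last axis (if it is instead a dict keyed by model part, the corresponding entries would need to be compared):

import numpy as np

def same_activations(out_a, out_b):
    # Assumed layout: token positions on the last axis; trim to the shorter generation
    a = np.asarray(out_a.activations)
    b = np.asarray(out_b.activations)
    n = min(a.shape[-1], b.shape[-1])
    return a.shape[:-1] == b.shape[:-1] and np.array_equal(a[..., :n], b[..., :n])

# With the first version of generate_batch this prints True for different prompts (the bug);
# with the second version it prints False, as expected.
print(same_activations(outputs[0], outputs[1]))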

I'm unsure why the first approach doesn't work as expected. It seems that re-instantiating the model for each input text resolves the issue. Could you please help me understand the cause of this behavior and suggest a possible solution?

Thank you for your assistance.