likenneth / honest_llama

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

Why does memory accumulate and ultimately cause overflow when running get_activations.py? #26

Closed · Renpf2022 closed this issue 6 months ago

Vicent0205 commented 10 months ago

It's better to load and save in batches to avoid the overflow.
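For example, a rough sketch of that approach (the batch size and output file names are illustrative; `prompts`, `model`, and `device` are assumed to come from the surrounding script):

    import numpy as np
    from tqdm import tqdm

    SAVE_EVERY = 100  # illustrative batch size

    batch_layer, batch_head = [], []
    for i, prompt in enumerate(tqdm(prompts)):
        layer_wise, head_wise, _ = get_llama_activations_bau(model, prompt, device)
        batch_layer.append(layer_wise[:, -1, :])
        batch_head.append(head_wise[:, -1, :])
        if (i + 1) % SAVE_EVERY == 0 or i == len(prompts) - 1:
            # flush this batch to disk and drop it, so memory held at any
            # time is bounded by the batch size instead of the whole dataset
            np.save(f'layer_wise_{i + 1}.npy', np.stack(batch_layer))
            np.save(f'head_wise_{i + 1}.npy', np.stack(batch_head))
            batch_layer, batch_head = [], []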

Renpf2022 commented 10 months ago

It seems that the data whose activations have already been extracted still occupies memory and is never released.

Wooorry commented 6 months ago

It seems that there's a memory leak. Below are lines 65-69 in the original code:

    for prompt in tqdm(prompts):
        layer_wise_activations, head_wise_activations, _ = get_llama_activations_bau(model, prompt, device)
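        # note: [:, -1, :] returns a view, so each append below keeps the
        # prompt's entire activation array referenced in memory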
        all_layer_wise_activations.append(layer_wise_activations[:,-1,:])
        all_head_wise_activations.append(head_wise_activations[:,-1,:])

Which I changed to:

    for prompt in tqdm(prompts):
        layer_wise_activations, head_wise_activations, _ = get_llama_activations_bau(model, prompt, device)
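        # .copy() materializes just the last-token slice, so the full
        # activation array can be garbage-collected after each prompt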
        all_layer_wise_activations.append(layer_wise_activations[:,-1,:].copy())
        all_head_wise_activations.append(head_wise_activations[:,-1,:].copy())

And it works.
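For anyone wondering why copy() matters here: NumPy basic slicing returns a view that keeps the whole base array referenced. A minimal sketch (the shape is made up):

    import numpy as np

    # stand-in for one prompt's (layers, seq_len, hidden) activations
    acts = np.zeros((33, 512, 4096), dtype=np.float32)

    view = acts[:, -1, :]
    print(view.base is acts)      # True: the view pins the full array in memory

    sliced = acts[:, -1, :].copy()
    print(sliced.base is None)    # True: an independent copy of just the slice

    del acts                      # the big buffer is freed only once `view` is
                                  # also gone, which is why appending views leaks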

likenneth commented 6 months ago

@Wooorry Thank you! I just added the two copy() calls in the master branch.