likenneth / honest_llama

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

Why does memory accumulate and ultimately cause overflow when running get_activations.py? #26

Closed · Renpf2022 closed this issue 6 months ago

Vicent0205 commented 10 months ago

It's better to load and save in batches to avoid the overflow.
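For example, a rough sketch of that approach (the batch size and output file names are illustrative; `prompts`, `model`, and `device` are assumed to come from the surrounding script):

    import numpy as np
    from tqdm import tqdm

    SAVE_EVERY = 100  # illustrative batch size

    batch_layer, batch_head = [], []
    for i, prompt in enumerate(tqdm(prompts)):
        layer_wise, head_wise, _ = get_llama_activations_bau(model, prompt, device)
        batch_layer.append(layer_wise[:, -1, :])
        batch_head.append(head_wise[:, -1, :])
        if (i + 1) % SAVE_EVERY == 0 or i == len(prompts) - 1:
            # flush this batch to disk and drop it, so memory held at any
            # time is bounded by the batch size instead of the whole dataset
            np.save(f'layer_wise_{i + 1}.npy', np.stack(batch_layer))
            np.save(f'head_wise_{i + 1}.npy', np.stack(batch_head))
            batch_layer, batch_head = [], []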

Renpf2022 commented 10 months ago

It seems that the data whose activations have already been extracted still occupies memory and is never released.

Wooorry commented 6 months ago

It seems that there's a memory leak. Below are lines 65-69 in the original code:

    for prompt in tqdm(prompts):
        layer_wise_activations, head_wise_activations, _ = get_llama_activations_bau(model, prompt, device)
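        # note: [:, -1, :] returns a view, so each append below keeps the
        # prompt's entire activation array referenced in memory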
        all_layer_wise_activations.append(layer_wise_activations[:,-1,:])
        all_head_wise_activations.append(head_wise_activations[:,-1,:])

Which I changed to:

    for prompt in tqdm(prompts):
        layer_wise_activations, head_wise_activations, _ = get_llama_activations_bau(model, prompt, device)
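        # .copy() materializes just the last-token slice, so the full
        # activation array can be garbage-collected after each prompt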
        all_layer_wise_activations.append(layer_wise_activations[:,-1,:].copy())
        all_head_wise_activations.append(head_wise_activations[:,-1,:].copy())

And it works.
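For anyone wondering why copy() matters here: NumPy basic slicing returns a view that keeps the whole base array referenced. A minimal sketch (the shape is made up):

    import numpy as np

    # stand-in for one prompt's (layers, seq_len, hidden) activations
    acts = np.zeros((33, 512, 4096), dtype=np.float32)

    view = acts[:, -1, :]
    print(view.base is acts)      # True: the view pins the full array in memory

    sliced = acts[:, -1, :].copy()
    print(sliced.base is None)    # True: an independent copy of just the slice

    del acts                      # the big buffer is freed only once `view` is
                                  # also gone, which is why appending views leaks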

likenneth commented 6 months ago

@Wooorry Thank you! I just added the two copy() calls in the master branch.