I keep hitting the same CUDA error when running this part of the code while computing the similarity on llama2-7b:
```python
for batch in tqdm(dataloader, desc="Processing batches"):
    inputs = tokenizer(batch, return_tensors="pt", padding="longest",
                       max_length=max_length, truncation=True).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    attention_mask = inputs["attention_mask"]
    hidden_states = outputs.hidden_states
    last_non_padded_hidden_states = utils.get_last_non_padded_tokens(hidden_states, attention_mask)
    # Remove the first element to account for the input (embedding) layer not being
    # considered a model hidden layer. This adjustment is necessary for analyses
    # focusing on the model's internal transformations.
    last_non_padded_hidden_states = last_non_padded_hidden_states[1:]
    # Ensure that the length of last_non_padded_hidden_states matches the number
    # of model hidden layers.
    assert len(last_non_padded_hidden_states) == model.config.num_hidden_layers, \
        "Length of last_non_padded_hidden_states does not match expected number of hidden layers."
    # Compute distances and append to all_distances
    distances = utils.compute_block_distances(last_non_padded_hidden_states, layers_to_skip)
    for i, distance in enumerate(distances):
        all_distances[i].append(distance)
```
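For context, this is roughly what I understand `utils.get_last_non_padded_tokens` to do: for each layer's hidden states it gathers, per sequence, the hidden vector of the last non-padded token. A minimal sketch with plain lists (an assumption about the helper's behavior, not the repo's actual torch implementation; shapes assumed as `hidden_states` = tuple of `[batch][seq][dim]`, `attention_mask` = `[batch][seq]`):

```python
# Sketch of utils.get_last_non_padded_tokens using plain lists
# (assumption: the real helper does the same gather on torch tensors).
def get_last_non_padded_tokens(hidden_states, attention_mask):
    """For each layer, pick the hidden vector of the last non-padded token."""
    per_layer = []
    for layer in hidden_states:                       # layer: [batch][seq][dim]
        last_tokens = []
        for row, mask in zip(layer, attention_mask):  # row: [seq][dim], mask: [seq]
            last_idx = sum(mask) - 1                  # index of the last real token
            last_tokens.append(row[last_idx])
        per_layer.append(last_tokens)
    return per_layer
```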
I tested on torch1.13+cu116 and on torch2.10+cu122, but both setups ran into a CUDA error.

With torch2.10+cu122:
```
CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)`
```
With torch1.13+cu116:
```
...
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [6,0,0], thread: [107,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
```
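Since the torch1.13 trace is a device-side index-out-of-bounds assertion (and rerunning with `CUDA_LAUNCH_BLOCKING=1` should make the failing op show up at the right line, since CUDA errors are reported asynchronously), one thing worth checking is whether any token id produced by the tokenizer falls outside the embedding table, e.g. if a new pad token was added without resizing the embeddings. A minimal sketch of that check (the `vocab_size` value and the example ids here are made up; in the real script the ids come from `inputs["input_ids"]` and the table size from `model.config.vocab_size`):

```python
# Hypothetical bounds check: every input id must index into the embedding table.
# In the real script: vocab_size = model.config.vocab_size and
# input_ids = inputs["input_ids"].tolist().
vocab_size = 32000              # llama2-7b's default vocab size (assumption)
input_ids = [[1, 306, 32000]]   # made-up batch; 32000 would be out of bounds
out_of_bounds = [t for row in input_ids for t in row if not (0 <= t < vocab_size)]
print(out_of_bounds)  # any ids listed here would trigger the device-side assert
```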