agemagician / ProtTrans

ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.
Academic Free License v3.0

Error while saving the fine-tuned model #155

Closed Chinjuj2017 closed 2 months ago

Chinjuj2017 commented 2 months ago

Hi, when I try to save the ProtT5 model after fine-tuning it with my dataset, I get the following error: "TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object". May I know how to resolve this? PS: I followed your notebook on LoRA fine-tuning (per_prot). Thanks in advance.

RSchmirler commented 2 months ago

Hi @Chinjuj2017, without any shared code this is hard to debug. Are you running a multi-GPU setup? Perhaps you can detach the parameters before saving; let me know if this works.

import torch

def save_model(model, filepath):
    # Save all parameters that were changed during fine-tuning

    # Create a dictionary to hold the non-frozen parameters
    non_frozen_params = {}

    # Iterate through all the model parameters
    for param_name, param in model.named_parameters():
        # If the parameter has requires_grad=True, store a detached CPU copy
        if param.requires_grad:
            non_frozen_params[param_name] = param.detach().cpu().clone()

    # Save only the fine-tuned parameters
    torch.save(non_frozen_params, filepath)
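
Loading the saved parameters back could look roughly like this (a minimal sketch, not tested on your setup; `load_finetuned_params` is just an illustrative name, and it assumes the model was rebuilt with the same architecture and LoRA config before loading):

import torch

def load_finetuned_params(model, filepath):
    # Load the dictionary of fine-tuned parameters written by save_model
    non_frozen_params = torch.load(filepath, map_location="cpu")

    # Copy the saved values into the matching parameters of the model
    with torch.no_grad():
        for param_name, param in model.named_parameters():
            if param_name in non_frozen_params:
                param.copy_(non_frozen_params[param_name])

    return model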
Chinjuj2017 commented 2 months ago

Hi @RSchmirler, thank you for the response. Sorry, I didn't share any code; I am using a multi-GPU setup. I will try your solution and get back to you.