sileod / tasknet

Easy multi-task learning with HuggingFace Datasets and Trainer

How to save and load a tasknet model? #3

Open thirsima opened 11 months ago

thirsima commented 11 months ago

Hi! I tried the basic 3-task example from the README file, and the training worked fine. Then I tried to save and load the model:

Saving the model worked ok:

trainer.save_model("tasknet-model")

But loading the model gives an error:

loaded = tn.Model.from_pretrained('./tasknet-model')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[16], line 1
----> 1 loaded = tn.Model.from_pretrained('./tasknet-model')

File ~/projects/keha/Tekoaly/trials/skillrecommendation-language-model/venv/lib/python3.10/site-packages/transformers/modeling_utils.py:2175, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2173 if not isinstance(config, PretrainedConfig):
   2174     config_path = config if config is not None else pretrained_model_name_or_path
-> 2175     config, model_kwargs = cls.config_class.from_pretrained(
   2176         config_path,
   2177         cache_dir=cache_dir,
   2178         return_unused_kwargs=True,
   2179         force_download=force_download,
   2180         resume_download=resume_download,
   2181         proxies=proxies,
   2182         local_files_only=local_files_only,
   2183         use_auth_token=use_auth_token,
   2184         revision=revision,
   2185         subfolder=subfolder,
   2186         _from_auto=from_auto_class,
   2187         _from_pipeline=from_pipeline,
   2188         **kwargs,
   2189     )
   2190 else:
   2191     model_kwargs = kwargs

AttributeError: 'NoneType' object has no attribute 'from_pretrained'

What is the correct way to save and load the model?

sileod commented 11 months ago

Hi! The library uses a shared encoder + "adapters" (task embeddings + task heads, e.g. classifiers). Saving stores the shared encoder and the adapters.

Currently, if you want to start again, you should load the saved encoder and fill in the adapter weights one by one with a loop.
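Something like this hypothetical loop (the checkpoint filename follows the transformers default, and model is assumed to be a freshly constructed tn.Model with the same tasks; neither is tasknet's documented API):

import torch

# Copy each saved tensor back into the freshly built model, name by name;
# entries whose names or shapes don't match are simply skipped.
saved = torch.load("tasknet-model/pytorch_model.bin", map_location="cpu")
with torch.no_grad():
    for name, tensor in model.state_dict().items():
        if name in saved and saved[name].shape == tensor.shape:
            tensor.copy_(saved[name])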

Training is multi-task, but the model is typically used for a single task. What is your use case?

thirsima commented 11 months ago

Thanks! I will try loading the encoder and adapters separately.

Eventually my use case will be to train a model that can do both sentence similarity and token classification, but at the moment I am just trying to find a multi-task training module that works without problems. So far tasknet looks the most promising.

I guess tasknet does not support sentence similarity at the moment, but looking at the currently supported task implementations, it should not be too hard to add.

thirsima commented 11 months ago

To clarify the use case: I eventually want to implement a microservice that loads the trained encoder and trained adapters from local files, so that the encoder is shared between the 2 tasks.

sileod commented 11 months ago

Sentence similarity is already supported: just use the tn.Classification template where y is a float, so it should work off the shelf. This code shows how to specialize the encoder for one of the training tasks: https://github.com/sileod/tasknet/blob/2d1c49e7d291b76e810a2a39c31779f1635262bb/src/tasknet/utils.py#L137
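For instance, a minimal task definition following the README's pattern (GLUE STS-B is just an illustrative float-labeled dataset, not part of this thread):

import tasknet as tn
from datasets import load_dataset

# STS-B similarity scores are floats in [0, 5], so with a float y column
# tn.Classification should behave as a regression / similarity task.
stsb = tn.Classification(
    dataset=load_dataset("glue", "stsb"),
    s1="sentence1", s2="sentence2", y="label")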

thirsima commented 11 months ago

Currently, if I call trainer.save_model(task_index) for 4 tasks, 4 different copies of the encoder are saved to disk and the files seem to have differences. And if I use load_pipeline() for all 4 tasks, I have 4 copies of the encoder in memory.

Is it possible to load the 4 tasks so that the encoder would be shared again? My aim is to avoid excessive memory consumption when I have multiple tasks that could use a shared encoder.

tasknet.Model.__init__() seems to have a warm_start parameter. Would it be feasible to load the encoder from one task first, and then warm-start tasknet.Model with that encoder?

sileod commented 11 months ago

Currently, when the model is saved, it saves a single encoder + a set of adapters. The Adapter class is actually a collection of adapters. I'll try to clarify this, thanks.

Then you can load the single encoder and the set of adapters, and use

model = adapter.adapt_model_to_task(model, task_name)

So you should save once, then call adapt_model_to_task once per task. If you do:

model_t1 = adapter.adapt_model_to_task(model, task_name1)
model_t2 = adapter.adapt_model_to_task(model, task_name2)

you will have different model objects, but they will share the same weights. You can have hundreds of models; if they use the same weights, it will not use much more memory than one (that's how I trained deberta-tasksource-nli on a single GPU with tasknet). The main concern is how to address task embeddings properly.
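A sketch of that flow (the local paths and tn.Adapter.from_pretrained are my assumptions; only adapt_model_to_task is confirmed in this thread):

import tasknet as tn
from transformers import AutoModelForSequenceClassification

# Load the shared encoder once, plus the saved adapter collection
# (paths are hypothetical, matching the save layout described above).
model = AutoModelForSequenceClassification.from_pretrained("./tasknet-model")
adapter = tn.Adapter.from_pretrained("./tasknet-model-adapters")

# One model object per task; they share the encoder weights, so memory
# stays close to that of a single model.
model_t1 = adapter.adapt_model_to_task(model, "task_name1")
model_t2 = adapter.adapt_model_to_task(model, "task_name2")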

cuongnguyengit commented 4 months ago

Hi, is there a way to combine all tasks into one model at the inference step? Can task_index be made a variable of the model?

sileod commented 4 months ago

You should use task_model_list: https://github.com/sileod/tasknet/blob/main/src/tasknet/models.py#L188 It's a torch module, so it can take variables as input, and it's differentiable. If you are talking about actual inference outside of training, it's cleaner to use the adapter as mentioned above.
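For example (a rough sketch; the exact attribute name and indexing should be checked against the linked line, and batch stands for a tokenized input dict):

# Select one task head at inference time; task_index is an ordinary
# variable, so it can be chosen dynamically per request.
task_index = 2
task_model = model.task_model_list[task_index]  # name as written above; verify against models.py#L188
outputs = task_model(**batch)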