AkariAsai / ATTEMPT

This is the official repository for "Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts" (EMNLP 2022)
MIT License

multi-task training #4

Closed puraminy closed 1 year ago

puraminy commented 1 year ago

Hi, how can I train the model when I provide multiple datasets? The code concatenates the datasets, but as far as I can tell each batch must contain only one task. How do you shuffle such batches?

Currently I get an error on this line:

    def check_uniqueness(self, samples):
        assert len(np.unique(samples)) == 1
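
For context, one way to satisfy that assertion is to build every batch from a single task. A minimal sketch, assuming a flat list of per-example task ids; task_grouped_batches, task_ids, and batch_size are illustrative names rather than the repository's API:

    # Sketch only (not the repository's code): group example indices by task id so
    # that every batch passed to check_uniqueness contains a single task.
    import random
    from collections import defaultdict

    def task_grouped_batches(task_ids, batch_size, seed=42):
        """Return lists of example indices, each drawn from exactly one task."""
        rng = random.Random(seed)
        by_task = defaultdict(list)
        for idx, tid in enumerate(task_ids):
            by_task[tid].append(idx)
        batches = []
        for indices in by_task.values():
            rng.shuffle(indices)
            # split each task's examples into batch-sized chunks
            batches.extend(indices[i:i + batch_size] for i in range(0, len(indices), batch_size))
        rng.shuffle(batches)  # shuffle batch order across tasks
        return batches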
AkariAsai commented 1 year ago

Hi, thank you so much for your interest in our work! Could you tell me a bit more about the error? Which datasets are you using for the multi-task training, and which configuration are you using?

I suspect it happens when the concatenated data has different data fields (e.g., MultiRC or ReCoRD uses additional metadata fields). I should have noted this in the README: when I did multi-task training on tasks with different formats, I modified the task field to avoid the error (or I think you can simply comment out check_uniqueness, but I haven't tried that).
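
One possible reading of "modified the task field", shown only as a hypothetical illustration (the "task" key and the helper name are assumptions, not the repository's code): give every example a shared task label before concatenation so the uniqueness check passes.

    # Hypothetical sketch: assign all examples the same task label before concatenating,
    # so check_uniqueness sees a single value; the "task" key is an assumption.
    def unify_task_field(examples, shared_task_name="multi"):
        for example in examples:
            example["task"] = shared_task_name
        return examples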

puraminy commented 1 year ago

Thanks for your reply.

Actually, I am working on extending your project to my own use case. Since you check the uniqueness of the task_ids in a batch, I assumed you somehow guarantee this in the multi-task setting, where multiple tasks exist and each one has its own task id. I thought this would be enforced when you concatenate the datasets together.

Anyway, I implemented a function like interleave(train_datasets, ...) to do so and used it in place of concatenate(train_datasets, ...).

Still, I am not sure whether this is a good solution for multi-task training, i.e., having batches from different tasks where each batch comes entirely from one task. It matters particularly in my prompt-tuning setup, where only the prompts are trained and each task's prompt can be largely independent of the others while sharing some data. Some care probably also needs to be taken with the optimizer and scheduler...
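
For reference, a rough sketch of the kind of interleaving described above, assuming each entry of train_datasets is an indexable sequence of examples and that the downstream loader reads fixed-size batches in order; the function below is illustrative, not the actual implementation:

    # Rough sketch only: chunk each task's dataset into batch-sized pieces, then
    # shuffle the chunks so training alternates between tasks while every batch
    # still comes from a single task.
    import random

    def interleave(train_datasets, batch_size, seed=42):
        rng = random.Random(seed)
        chunks = []
        for dataset in train_datasets:
            examples = list(dataset)
            rng.shuffle(examples)
            # keep only full chunks so fixed-size batching downstream stays aligned
            chunks.extend(examples[i:i + batch_size]
                          for i in range(0, len(examples) - batch_size + 1, batch_size))
        rng.shuffle(chunks)  # mix the order of task-pure chunks across tasks
        # flatten back into a single stream; every contiguous batch stays task-pure
        return [example for chunk in chunks for example in chunk]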

puraminy commented 1 year ago

Anyway, thanks. I resolved the problem and, for now, have simply commented that check out.

By the way, what does the multi_task option mean? I didn't find it being set in any configuration file.

https://github.com/AkariAsai/ATTEMPT/search?q=multi_task

AkariAsai commented 1 year ago

Thanks again for your detailed comments and interest! Yes, different mini-batch constructions could be considered. I personally think that having multiple different tasks in the same mini-batch might help learn better attention layers, but we haven't explored those different strategies.

The multi_task option indicates multi-task training, but it's unnecessary: if you set the shared_attn option, the code assumes multi-task training automatically. It's a leftover from an older version that I forgot to remove while refactoring. Thanks for the heads-up!