huu4ontocord / MDEL

Multi-Domain Expert Learning
Apache License 2.0

Expert merging: c-BTM #40

Open mrcabbage972 opened 1 year ago

mrcabbage972 commented 1 year ago

We would like a script that creates a merged model using the c-BTM method.

The script would take as input:

- A list of expert models from the [MDEL HF repo](https://huggingface.co/Multi-Domain-Expert-Layers).
- The name of the output model.

The averaged model would be uploaded to the MDEL HF repo. Its model card should list the names of the experts it was created from.
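To make the request concrete, here is a minimal sketch of the averaging step and the model card text, not a definitive implementation. It assumes that "averaged" means a uniform element-wise mean over expert parameters, and it stands in plain Python lists for tensors so that it runs without any ML library. The names `average_state_dicts` and `model_card` are hypothetical, and downloading from or uploading to the HF repo is left out.

```python
from typing import Dict, List, Sequence

# Assumption: each expert is represented as a mapping from parameter name
# to a flat list of floats. A real script would use framework tensors.
StateDict = Dict[str, List[float]]


def average_state_dicts(experts: Sequence[StateDict]) -> StateDict:
    """Return the uniform element-wise mean of the experts' parameters."""
    if not experts:
        raise ValueError("need at least one expert")
    names = set(experts[0])
    for state in experts[1:]:
        if set(state) != names:
            raise ValueError("experts must share the same parameter names")

    merged: StateDict = {}
    n = len(experts)
    for name in experts[0]:
        columns = [state[name] for state in experts]
        if len({len(col) for col in columns}) != 1:
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # zip(*columns) walks the same position across all experts.
        merged[name] = [sum(values) / n for values in zip(*columns)]
    return merged


def model_card(output_name: str, expert_names: Sequence[str]) -> str:
    """Build model card text that lists the experts used for the merge."""
    lines = [f"# {output_name}", "", "Merged from the following experts:"]
    lines += [f"- {name}" for name in expert_names]
    return "\n".join(lines) + "\n"
```

With real checkpoints the same loop would run over each expert's state dict, and all experts would need to share one architecture for the mean to be meaningful.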

kenhktsui commented 1 year ago

I would like to work on this too. @NourFahmy, there are two steps that we could split 😃

mrcabbage972 commented 1 year ago

@NourFahmy @kenhktsui Check out Minho's adaptation of the clustering step from the c-BTM repo.
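For context on how the clustering output feeds inference, here is a rough sketch of the expert weighting as I understand it from the c-BTM paper: each expert is weighted in proportion to `exp(-dist^2 / T)`, where `dist` is the distance from the context embedding to that expert's cluster center, and only the top-k closest experts are kept. The function name `expert_weights`, the default values, and the use of plain lists for embeddings are assumptions for illustration. This is not Minho's code or the c-BTM repo's API.

```python
import math
from typing import List, Sequence


def expert_weights(
    context_vec: Sequence[float],
    centers: Sequence[Sequence[float]],
    temperature: float = 0.1,
    top_k: int = 2,
) -> List[float]:
    """Return one weight per expert; weights outside the top-k are 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 1 <= top_k <= len(centers):
        raise ValueError("top_k must be between 1 and the number of centers")

    # Squared Euclidean distance from the context to each cluster center.
    sq_dists = [
        sum((x - c) ** 2 for x, c in zip(context_vec, center))
        for center in centers
    ]
    # Indices of the top_k closest centers.
    keep = sorted(range(len(centers)), key=lambda j: sq_dists[j])[:top_k]

    # Subtract the smallest kept distance before exponentiating so that
    # the largest score is exp(0) = 1, which avoids underflow.
    d_min = min(sq_dists[j] for j in keep)
    scores = {j: math.exp(-(sq_dists[j] - d_min) / temperature) for j in keep}
    total = sum(scores.values())
    return [scores.get(j, 0.0) / total for j in range(len(centers))]
```

The resulting weights would then be used to mix the experts' next-token probabilities; a lower temperature concentrates the weight on the nearest cluster.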

NourFahmy commented 1 year ago

Hi @kenhktsui - happy to take on inference, to support clustering where needed, and to fill any gaps from Minho's efforts.

I've put up a PR here

I've made the following assumptions, which I can easily change:

Kindly let me know if anything else is needed!

cc: @mrcabbage972