yafuly / MAGE

Apache License 2.0

Separating model names and domains from source #8

Closed (zaemyung closed 9 months ago)

zaemyung commented 9 months ago

Hi! Thanks for releasing the code and datasets.

I am trying to separate model names and domains from the source field, but some model names and domains themselves contain underscores, which makes splitting on '_' ambiguous. Could you use a more distinctive separator (something other than '_') between them, or provide a list of the domains and model names?

Thanks!

yafuly commented 9 months ago

Hi,

Thanks for bringing this to our attention. We will consider using another delimiter.

Here is the list of domains and models:

set_names = ['cmv', 'yelp', 'xsum', 'tldr', 'eli5', 'wp', 'roct', 'hswag', 'squad', 'sci_gen']

model_names = [
    # GLM
    'GLM130B',
    # bloom
    'bloom_7b',
    # flan_t5,
    'flan_t5_small',
    'flan_t5_base',
    'flan_t5_large',
    'flan_t5_xl',
    'flan_t5_xxl',
    # t0
    't0_3b',
    't0_11b',
    # opt,
    'opt_125m',
    'opt_350m',
    'opt_1.3b',
    'opt_2.7b',
    'opt_6.7b',
    'opt_13b',
    'opt_30b',
    'opt_iml_30b',
    'opt_iml_max_1.3b',
    # gpt
    'gpt_j',
    'gpt_neox',
    # openai
    'gpt-3.5-trubo', 
    'text-davinci-003', 
    'text-davinci-002',
    # llama
    '_7B',
    '_13B',
    '_30B',
    '_65B'
    ]

oai_list = [
    # openai
    'gpt-3.5-trubo', 
    'text-davinci-003', 
    'text-davinci-002',
]
llama_list = [
    '_7B',
    '_13B',
    '_30B',
    '_65B'
]
glm_list = [
    'GLM130B',
]
flan_list = [
    # flan_t5,
    'flan_t5_small',
    'flan_t5_base',
    'flan_t5_large',
    'flan_t5_xl',
    'flan_t5_xxl',
]

opt_list = [
    # opt,
    'opt_125m',
    'opt_350m',
    'opt_1.3b',
    'opt_2.7b',
    'opt_6.7b',
    'opt_13b',
    'opt_30b',
    'opt_iml_30b',
    'opt_iml_max_1.3b',    
]
bigscience_list = [
    'bloom_7b',
    't0_3b',
    't0_11b',
]
eleuther_list = [
    'gpt_j',
    'gpt_neox',
]
model_sets = [oai_list, llama_list, glm_list, flan_list, opt_list, bigscience_list, eleuther_list]
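
With the lists above, a source string can be split without relying on '_' alone. The following is only a minimal sketch, not code from the repository. It assumes that a source value starts with a domain from set_names and, for machine-generated text, ends with a name from model_names, with any remaining label in the middle. The function name split_source and the example strings are hypothetical; check them against the actual values in your copy of the data.

```python
# Lists copied verbatim from the reply above (spellings kept exactly as given,
# since they have to match the strings stored in the source field).
SET_NAMES = ['cmv', 'yelp', 'xsum', 'tldr', 'eli5', 'wp', 'roct', 'hswag',
             'squad', 'sci_gen']
MODEL_NAMES = [
    'GLM130B', 'bloom_7b',
    'flan_t5_small', 'flan_t5_base', 'flan_t5_large', 'flan_t5_xl', 'flan_t5_xxl',
    't0_3b', 't0_11b',
    'opt_125m', 'opt_350m', 'opt_1.3b', 'opt_2.7b', 'opt_6.7b', 'opt_13b',
    'opt_30b', 'opt_iml_30b', 'opt_iml_max_1.3b',
    'gpt_j', 'gpt_neox',
    'gpt-3.5-trubo', 'text-davinci-003', 'text-davinci-002',
    '_7B', '_13B', '_30B', '_65B',
]


def split_source(source):
    """Hypothetical helper: return (domain, middle, model).

    model is None when the string does not end with a known model name.
    """
    # Try longer names first so a short name never shadows a longer one.
    domain = next(
        (d for d in sorted(SET_NAMES, key=len, reverse=True)
         if source == d or source.startswith(d + '_')),
        None,
    )
    if domain is None:
        raise ValueError(f'no known domain prefix in {source!r}')

    rest = source[len(domain):].lstrip('_')
    model = None
    for name in sorted(MODEL_NAMES, key=len, reverse=True):
        bare = name.lstrip('_')  # the llama entries carry a leading underscore
        if rest == bare or rest.endswith('_' + bare):
            model = name
            rest = rest[:len(rest) - len(bare)].rstrip('_')
            break
    return domain, rest, model


# Example strings below are assumed formats, not taken from the dataset.
print(split_source('sci_gen_machine_continuation_flan_t5_xxl'))
print(split_source('cmv_human'))
```

Matching the domain as a prefix and the model as a suffix, longest name first, keeps multi-underscore names such as sci_gen and opt_iml_max_1.3b intact, and whatever sits between the two is returned unchanged.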

zaemyung commented 9 months ago

This is perfect - thanks!! 🙏