yangheng95 / PyABSA

Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.;
https://pyabsa.readthedocs.io
MIT License

Does changing the base model affect model performance? #113

Closed. jackie930 closed this issue 2 years ago

jackie930 commented 2 years ago

Hi,

I saw that all tasks basically load the pretrained `bert-base` model, and based on my past experience, I am wondering whether we should implement more backbones to improve model performance (e.g., RoBERTa, T5, CPT, etc.).

Have you already tested this, or do you have any thoughts on the idea?

yangheng95 commented 2 years ago

I tried to replace almost all hard-coded parts with a scalable approach, i.e., BertTokenizer -> AutoTokenizer. But I did not evaluate the difference. Do you have any idea about what kind of implementation would better support more backbones?
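
For reference, the difference between the hard-coded and the scalable loading path looks roughly like this in the Hugging Face transformers API (a minimal sketch, not PyABSA's actual internals; the checkpoint names are only examples):

```python
from transformers import AutoModel, AutoTokenizer, BertTokenizer

# Hard-coded: tied to BERT-style checkpoints only.
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Scalable: the Auto* classes resolve the correct tokenizer/model class
# from the checkpoint name, so RoBERTa, DeBERTa, etc. work the same way.
tokenizer = AutoTokenizer.from_pretrained('roberta-base')
backbone = AutoModel.from_pretrained('roberta-base')
```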

jackie930 commented 2 years ago

> I tried to replace almost all hard-coded parts with a scalable approach, i.e., BertTokenizer -> AutoTokenizer. But I did not evaluate the difference. Do you have any idea about what kind of implementation would better support more backbones?

My fault :( I gave it a try as below, and it seems to work well, so just to double check: the code currently already supports different backbones via AutoTokenizer.

Then may I ask what you mean by

> what kind of implementation would better support more backbones?


```python
from pyabsa.functional import Trainer
from pyabsa.functional import APCConfigManager
from pyabsa.functional import ABSADatasetList
from pyabsa.functional import APCModelList

config = APCConfigManager.get_apc_config_chinese()
config.evaluate_begin = 1
config.dropout = 0.5
config.l2reg = 0.0001
config.num_epoch = 3
config.pretrained_bert = 'roberta-base'
config.model = APCModelList.FAST_LCF_BERT
# chinese_sets = 'mooc'
sent_classifier = Trainer(config=config,  # set config=None to use the default model
                          dataset='./custom_apc',  # train and test sets are detected automatically
                          auto_device=True  # automatically choose CUDA or CPU
                          )
```
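
If that is right, trying another backbone should just be a matter of pointing `config.pretrained_bert` at a different Hugging Face checkpoint name (the names below are untested examples):

```python
# Untested example checkpoints; any architecture the Auto* classes can load should work.
config.pretrained_bert = 'bert-base-chinese'
# config.pretrained_bert = 'hfl/chinese-roberta-wwm-ext'
```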
yangheng95 commented 2 years ago

I mean, is there a better way to support multiple PTMs (pretrained models), apart from AutoTokenizer and AutoModel?
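
One alternative, purely as an illustration of what a non-Auto* implementation could look like (a hypothetical helper, not part of PyABSA), is an explicit registry that maps each supported backbone to its concrete tokenizer and model classes:

```python
from transformers import (BertModel, BertTokenizer,
                          RobertaModel, RobertaTokenizer)

# Hypothetical explicit registry: each supported backbone family is mapped to
# its concrete classes, instead of relying on AutoTokenizer/AutoModel to resolve them.
BACKBONES = {
    'bert-base-uncased': (BertTokenizer, BertModel),
    'roberta-base': (RobertaTokenizer, RobertaModel),
}

def load_backbone(checkpoint: str):
    tokenizer_cls, model_cls = BACKBONES[checkpoint]
    return tokenizer_cls.from_pretrained(checkpoint), model_cls.from_pretrained(checkpoint)
```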