Closed by ibozkurt79, 2 years ago
I think `finbert.base_model` is only used to construct the tokenizer here:
https://github.com/ProsusAI/finBERT/blob/44995e0c5870c4ab37a189d756550654ae87cdf0/finbert/finbert.py#L175
Which should be fine because we haven't changed the tokenizer at any step.
`finbert.base_model = 'bert-base-uncased'` is where the base model is defined, but `lm_path = project_dir/'models'/'language_model'/'finbertTRC2'` and `bertmodel = AutoModelForSequenceClassification.from_pretrained(lm_path, cache_dir=None, num_labels=3)` are the inputs to the config at initialization. Basically, it looks like `lm_path`, which points to the pretrained FinBERT, is replaced by `bert-base-uncased` in the subsequent code.
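For what it's worth, here is a minimal sketch of the wiring being discussed, assuming `transformers` is installed. The function name `load_finbert` is hypothetical; `lm_path` and the `from_pretrained` call mirror the snippet above. The point is that `base_model` feeds only the tokenizer, while the classifier weights are read from `lm_path`:

```python
from pathlib import Path

# Paths mirror the snippet in the comment above; adjust to your checkout.
project_dir = Path(".")
lm_path = project_dir / "models" / "language_model" / "finbertTRC2"

def load_finbert(base_model: str = "bert-base-uncased"):
    """Hypothetical sketch of the model/tokenizer wiring.

    base_model is used ONLY to build the tokenizer (safe, since the
    TRC2 fine-tuning did not change the vocabulary); the classification
    model's weights come from lm_path, i.e. the FinBERT language model,
    not from bert-base-uncased.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForSequenceClassification.from_pretrained(
        lm_path, cache_dir=None, num_labels=3
    )
    return tokenizer, model
```

So if the tokenizer really is untouched by the TRC2 step, loading it from `bert-base-uncased` while loading weights from `lm_path` should be harmless.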