A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
There are so many naming formats here that I don't think the filter above is general enough to cover every model on https://huggingface.co/models. For example, neither hfl/chinese-bert-wwm-ext nor hfl/chinese-bert-wwm fits the .split('-')[0] rule.
https://github.com/NVIDIA/NeMo/blob/6452ae3b51b969e6b778947ddaacb7c91d2780f7/nemo/collections/nlp/data/tokenizers/bert_tokenizer.py#L77
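To make the concern concrete, here is a minimal sketch of how a prefix rule based on .split('-')[0] behaves on these names. The helper names `prefix_rule` and `infer_model_type` and the `KNOWN_TYPES` tuple are hypothetical, written only for illustration; they are not NeMo's actual code, and the token-matching alternative is just one possible direction.

```python
# Hypothetical helpers for illustration only; not taken from NeMo.

def prefix_rule(name: str) -> str:
    """The rule under discussion: take everything before the first '-'."""
    return name.split('-')[0]

# Assumed list of model types to look for (illustrative, not exhaustive).
KNOWN_TYPES = ("bert", "roberta", "albert")

def infer_model_type(name: str):
    """Drop any 'org/' prefix, then look for a known type among the '-' tokens."""
    tokens = name.split('/')[-1].lower().split('-')
    for model_type in KNOWN_TYPES:
        if model_type in tokens:
            return model_type
    return None

for name in ("bert-base-uncased", "hfl/chinese-bert-wwm", "hfl/chinese-bert-wwm-ext"):
    print(name, "->", prefix_rule(name), "|", infer_model_type(name))
# bert-base-uncased -> bert | bert
# hfl/chinese-bert-wwm -> hfl/chinese | bert
# hfl/chinese-bert-wwm-ext -> hfl/chinese | bert
```

The prefix rule only works when the model type happens to be the first token, which is not the case for names that carry an organization or language prefix.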