The script does not properly process the head_dim parameter, which results in a weight shape mismatch.
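For illustration, a minimal sketch of why deriving head_dim from hidden_size breaks here. The config values below are taken from Mistral-Nemo-Base-2407's config.json and should be treated as illustrative, not authoritative:

```python
# Mistral NeMo sets head_dim explicitly in config.json; it is NOT
# hidden_size // num_attention_heads, unlike earlier Mistral models.
# Illustrative values from Mistral-Nemo-Base-2407's config.json:
config = {"hidden_size": 5120, "num_attention_heads": 32, "head_dim": 128}

derived = config["hidden_size"] // config["num_attention_heads"]  # 160
actual = config.get("head_dim", derived)                          # 128

# q_proj weights are shaped (num_heads * head_dim, hidden_size), so a
# converter that uses the derived value computes the wrong target shape
# and fails to load the checkpoint weights.
wrong_shape = (config["num_attention_heads"] * derived, config["hidden_size"])
right_shape = (config["num_attention_heads"] * actual, config["hidden_size"])
```

Reading head_dim from the config (falling back to the derived value only when the key is absent) would keep the converter working for both the older 7B checkpoints and the new model.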
Supporting the above also requires upgrading the transformers package to the latest version, which introduces additional breakage: ALBERT_PRETRAINED_MODEL_ARCHIVE_LIST is no longer defined in transformers, but is still referenced in nemo/collections/nlp/modules/common/huggingface/huggingface_utils.py.
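One possible workaround (a sketch, not NeMo's actual fix): guard the import so huggingface_utils.py still loads on newer transformers releases, which no longer ship the *_PRETRAINED_MODEL_ARCHIVE_LIST constants. The fallback checkpoint names are illustrative:

```python
# On older transformers the constant imports fine; on newer releases the
# import fails, and we substitute an explicit list of checkpoint names.
try:
    from transformers.models.albert.modeling_albert import (
        ALBERT_PRETRAINED_MODEL_ARCHIVE_LIST,
    )
except ImportError:
    # Newer transformers removed the archive-list constants; this subset
    # of ALBERT checkpoint names is shown for illustration only.
    ALBERT_PRETRAINED_MODEL_ARCHIVE_LIST = ["albert-base-v1", "albert-base-v2"]
```

ModuleNotFoundError subclasses ImportError, so this guard also covers environments where transformers is absent entirely.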
Tokenizer conversion is not handled correctly: the script looks for tokenizer.model, which does not exist for this new model; Mistral NeMo uses the Tekken tokenizer, and the tokenizer file is a JSON.
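A hedged sketch of how the conversion script could locate the right tokenizer artifact instead of hard-coding tokenizer.model; the find_tokenizer helper is an assumption for illustration, not NeMo's actual logic, and the candidate filenames reflect what the respective checkpoints typically ship:

```python
from pathlib import Path

def find_tokenizer(model_dir: str) -> Path:
    """Return the first tokenizer artifact present in a HF checkpoint dir."""
    d = Path(model_dir)
    candidates = (
        d / "tokenizer.model",  # sentencepiece (older Mistral 7B checkpoints)
        d / "tekken.json",      # Tekken tokenizer (Mistral NeMo)
        d / "tokenizer.json",   # HF fast-tokenizer export
    )
    for path in candidates:
        if path.exists():
            return path
    raise FileNotFoundError(f"no tokenizer artifact found in {model_dir}")
```

The converter could then branch on the returned filename to build either a sentencepiece-based or a JSON-based tokenizer config.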
Describe the solution you'd like
Upgrade the transformers package to the latest version.
Update the Mistral conversion script to support the latest Mistral NeMo model.
Is your feature request related to a problem? Please describe.
Currently the only provided conversion script is https://github.com/NVIDIA/NeMo/blob/main/scripts/checkpoint_converters/convert_mistral_7b_hf_to_nemo.py. This script does not support converting the latest https://huggingface.co/mistralai/Mistral-Nemo-Base-2407 model; it fails with the issues described above.