mosaicml / examples

Fast and flexible reference benchmarks

Accessing model after pre-training #401

Closed. uconnectbrown closed this issue 10 months ago.

uconnectbrown commented 1 year ago

Following the instructions laid out in the README, I pre-trained a model using the Mosaic BERT architecture, with save_folder pointed at a directory in an S3 bucket. Checkpointing has stored a .pt file as desired, but is there a recommended method for loading the model back into SageMaker to perform inference and to confirm that the model is performing as expected? Any recommendations or suggestions would be immensely appreciated!

dakinggg commented 12 months ago

Hi, I'm not able to help with SageMaker, but https://github.com/mosaicml/composer/blob/457717427e4d84f645e04fd801e79ab45fd26877/composer/models/huggingface.py#L541 can be used to extract a config.json and pytorch_model.bin from the .pt file. Those files should be loadable by the BertForMaskedLM and BertConfig classes, and should generally be compatible with the code in the examples repo. If you'd like to package the model code in the checkpoint as Hugging Face generally does, we unfortunately don't have a script for that for BERT, but you can roughly just copy the modeling files into the same folder. We do have a script in our llm-foundry repo that should be a good starting point if you are trying to automate this: https://github.com/mosaicml/llm-foundry/blob/main/scripts/inference/convert_composer_to_hf.py.
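
For anyone landing here later, a minimal sketch of the flow described above. It assumes a Composer version that exposes `write_huggingface_pretrained_from_composer_checkpoint` (the function at the permalink); the checkpoint path is hypothetical and exact argument names may vary between Composer releases:

```python
# Sketch only: extract HF-format files from a Composer .pt checkpoint and
# load them back for masked-LM inference. Paths here are hypothetical.
from composer.models.huggingface import (
    write_huggingface_pretrained_from_composer_checkpoint,
)
from transformers import BertConfig, BertForMaskedLM

# Writes config.json and pytorch_model.bin into the output folder. Remote
# URIs (e.g. s3://...) are fetched via Composer's object-store utilities.
write_huggingface_pretrained_from_composer_checkpoint(
    's3://my-bucket/bert-pretraining/latest-rank0.pt',  # hypothetical path
    'hf_checkpoint',
)

# Load the extracted files back in. Note: if the checkpoint was trained with
# Mosaic BERT's custom layers, you may need to import BertForMaskedLM and
# BertConfig from the modeling code in this repo instead of from transformers.
config = BertConfig.from_pretrained('hf_checkpoint')
model = BertForMaskedLM.from_pretrained('hf_checkpoint', config=config)
model.eval()
```

The convert_composer_to_hf.py script linked above automates a similar extract-and-package flow for the LLM stack, so it is worth mirroring if you want to script this end to end.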