DachengLi1 / LongChat

Official repository for LongChat and LongEval
Apache License 2.0

Load the model for inference? #10

Closed fahadh4ilyas closed 1 year ago

fahadh4ilyas commented 1 year ago

On Hugging Face, your model is based on the Llama model, but when you train the model, you have to add monkey patches. Why aren't the monkey patches in the Hugging Face model? Does that mean I can load the model without monkey patches?

DachengLi1 commented 1 year ago

@fahadh4ilyas Monkey patches are code: they rewrite a small part of the forward() function. Models are model weights, which are irrelevant to how the forward function is executed.

You still need the monkey patch to do inference, but you don't have to apply it yourself. We are integrated into the FastChat system, and you can use load_model from their API; check here.
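The distinction above (weights are state, the monkey patch is code) can be sketched in plain Python. This is a minimal illustration with hypothetical classes, not the actual LongChat or FastChat code: the patch swaps the method that uses the weights, while the stored weights themselves are untouched.

```python
class TinyModel:
    """Stand-in for a model layer holding a weight (hypothetical)."""

    def __init__(self, weight):
        self.weight = weight  # "model weights": what gets saved in checkpoints

    def forward(self, x):
        return self.weight * x  # original forward pass


def patched_forward(self, x):
    # A patched forward, e.g. to change positional handling for long
    # context; here it just scales the input to make the change visible.
    return self.weight * x * 2


model = TinyModel(weight=3)
before = model.forward(10)  # original behavior

# Apply the "monkey patch": replace the method on the class at runtime.
TinyModel.forward = patched_forward

after = model.forward(10)   # patched behavior, same weights
assert model.weight == 3    # the weights never changed
```

This is why the Hugging Face checkpoint can be plain weights: anything that loads it (FastChat's load_model, or the training script doing the patch manually) just has to install the patched forward before running inference.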

fahadh4ilyas commented 1 year ago

Is there a reason you add the monkey patch code in the train script but not in the eval script? Why not use FastChat's model-loading method to load the model in the train script?

DachengLi1 commented 1 year ago

@fahadh4ilyas thanks for the question. As shown in the line above, the eval script also uses the monkey patch, but it is handled by FastChat automatically.

There is no particular reason we cannot use load_model from FastChat in training (to avoid confusion, we should do this in a future refactoring). It's simply that we integrated this into FastChat after the model was trained and didn't modify the training script afterwards. (It is the same thing either way, whether we apply the monkey patch manually or use FastChat's load_model.)

fahadh4ilyas commented 1 year ago

Oh that makes sense. Thank you for the confirmation.