nebuly-ai / optimate

A collection of libraries to optimise AI model performance
https://www.nebuly.com/
Apache License 2.0
8.37k stars 639 forks

[Chatllama] Support Inference for trained models. #320

Open PierpaoloSorbellini opened 1 year ago

PierpaoloSorbellini commented 1 year ago

Description

Currently, to run inference on a trained model, the user has to write a small Python script that loads the resulting checkpoint (or saved model) according to how the library saves it after training, and then interact with the model through that script.
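As a rough illustration of the load-checkpoint-then-query workflow described above, here is a minimal sketch in plain PyTorch. The model class (`TinyActor`), checkpoint filename, vocabulary size, and greedy decoding loop are all hypothetical stand-ins, not the actual chatllama actor model or its API.

```python
# Minimal sketch: load a saved checkpoint and generate tokens greedily.
# TinyActor, the checkpoint path, and VOCAB_SIZE are hypothetical stand-ins
# for whatever actor model chatllama saves after training.
import torch
import torch.nn as nn

VOCAB_SIZE = 32  # hypothetical vocabulary size


class TinyActor(nn.Module):
    """Stand-in for the trained actor model."""

    def __init__(self, vocab: int = VOCAB_SIZE, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Returns (batch, seq, vocab) logits.
        return self.head(self.embed(ids))


@torch.no_grad()
def generate(model: nn.Module, prompt_ids: torch.Tensor,
             max_new_tokens: int = 8) -> torch.Tensor:
    """Greedy decoding: repeatedly append the argmax of the last-step logits."""
    model.eval()
    ids = prompt_ids
    for _ in range(max_new_tokens):
        logits = model(ids)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
    return ids


# Save a "trained" checkpoint, then reload it for inference, mirroring
# the small user-written script the issue describes.
model = TinyActor()
torch.save(model.state_dict(), "actor_checkpoint.pt")

restored = TinyActor()
restored.load_state_dict(torch.load("actor_checkpoint.pt"))
prompt = torch.tensor([[1, 2, 3]])
out = generate(restored, prompt)
print(out.shape)  # prompt length 3 plus 8 generated tokens
```

A built-in inference entry point would replace this boilerplate with something like a single `generate` call on the loaded model.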

Moreover, many optimizations could be integrated to speed up inference, such as:

TODO

shrinath-suresh commented 1 year ago

@PierpaoloSorbellini The inference section is tagged as WIP. Do we have any basic inference code available in chatllama to load the actor_rl model and run a few queries?