MengqingCao opened 1 month ago
Good day! @winglian I tried to create a class `ModelKwargs`, but modifying `model_kwargs` is entangled with many other operations such as patching, model creation, etc., and their conditional logic seems inseparable. Thus, I finally refactored the whole `load_model` function into a class `ModelLoader`. All the operations in the original `load_model` function have been placed in several member functions, following the original logical order. This brings a lot of changes, but it makes the model loading pipeline clearer, and changes to member variables such as `model_kwargs` are more visible. However, I am not sure whether the current function naming and pipeline splitting are completely reasonable. Please review the latest code and give me some suggestions. Thanks a lot!
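The refactor described above might look roughly like the sketch below. This is a hypothetical illustration only; the actual member-function names and the exact pipeline split in the PR may differ.

```python
class ModelLoader:
    """Hypothetical sketch of the load_model-to-class refactor.

    Each member function corresponds to one stage of the original
    load_model function, called in the original logical order, and
    shared state such as model_kwargs becomes a member variable so
    its mutations are easy to follow.
    """

    def __init__(self, cfg, tokenizer):
        self.cfg = cfg
        self.tokenizer = tokenizer
        self.model_kwargs = {}  # mutated by the stages below
        self.model = None

    def apply_patches(self):
        # e.g. attention monkey patches selected from self.cfg
        pass

    def set_model_kwargs(self):
        # fill self.model_kwargs (device_map, dtype, quantization, ...)
        self.model_kwargs["device_map"] = "auto"

    def build_model(self):
        # instantiate the model from the accumulated self.model_kwargs;
        # placeholder standing in for AutoModel.from_pretrained(...)
        self.model = object()

    def load(self):
        # preserve the original logical order of load_model
        self.apply_patches()
        self.set_model_kwargs()
        self.build_model()
        return self.model
```

Splitting the stages this way keeps each conditional block small while the `load` method documents the overall order.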
Hi @winglian, could you help review the latest code in this PR? Let me know if the breaking changes introduced by refactoring the original code are not what you want.
Just FYI, I accidentally deleted the original commit; it can be found in this branch.
Description
Enable Ascend NPU backend for finetuning, inferencing and gradio webui. Main changes:
- `device`
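The `device` change could be sketched as below. This is a hypothetical helper for illustration (the function name and exact detection logic are assumptions, not the PR's code); it prefers the Ascend NPU when the `torch_npu` plugin is installed and falls back to CUDA or CPU otherwise.

```python
import importlib.util


def resolve_device() -> str:
    """Return the preferred accelerator name, checking Ascend NPU first.

    Hypothetical sketch: axolotl's actual device selection may differ.
    """
    # torch_npu is Huawei's PyTorch plugin that registers the "npu" backend
    if importlib.util.find_spec("torch_npu") is not None:
        return "npu"
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    return "cpu"
```

Guarding the `torch_npu` import keeps the same code path working on machines without Ascend hardware.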
Motivation and Context
There are two benefits:
Example
Screenshots
NPU-supported CLI inference
NPU-supported Gradio webui inference
Config
lora.yaml
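A LoRA config for an NPU run might look like the fragment below. This is a hedged sketch using common axolotl-style keys; the actual `lora.yaml` used for the screenshots may differ in model choice and hyperparameters.

```yaml
# hypothetical lora.yaml sketch, not the exact config from this PR
base_model: meta-llama/Llama-2-7b-hf
adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 0.0002
output_dir: ./outputs/lora-npu
```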