alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Apache License 2.0
544 stars, 50 forks

Hello, I'd like to ask a question that may be a bit basic. In the code, the weights are loaded in Python. Where are they passed to the C++ (FasterTransformer) part? #86

Closed. samaritan1998 closed this issue 4 months ago

netaddi commented 4 months ago

https://github.com/alibaba/rtp-llm/blob/04fe4dafe5d204d14ec41f1b2ab0212398751d4b/maga_transformer/ops/rtp_llm/rtp_llm_op.py#L21
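
For readers who do not follow the link: it points into `maga_transformer/ops/rtp_llm/rtp_llm_op.py`, the Python-side wrapper around the C++ op. The general pattern for this kind of handoff can be sketched as below. This is only an illustration, not the rtp-llm code: `FakeNativeOp`, `PythonOpWrapper`, the `init` signature, and the `layers.<i>.` naming scheme are all hypothetical, and plain Python lists stand in for tensors so the sketch runs without PyTorch.

```python
from typing import Dict, List

# Stand-in for a tensor type so the sketch runs without PyTorch (assumption).
Tensor = List[float]


class FakeNativeOp:
    """Hypothetical stand-in for a C++ object exposed to Python through a binding."""

    def __init__(self) -> None:
        self.layer_weights: List[Dict[str, Tensor]] = []
        self.global_weights: Dict[str, Tensor] = {}

    def init(self, layer_weights: List[Dict[str, Tensor]],
             global_weights: Dict[str, Tensor]) -> None:
        # In a real binding this call would cross into C++; here we just store the references.
        self.layer_weights = layer_weights
        self.global_weights = global_weights


class PythonOpWrapper:
    """Hypothetical Python wrapper that groups loaded weights and hands them to the native op."""

    def __init__(self, native_op: FakeNativeOp) -> None:
        self.native_op = native_op

    def load(self, state_dict: Dict[str, Tensor], num_layers: int) -> None:
        layer_weights: List[Dict[str, Tensor]] = [{} for _ in range(num_layers)]
        global_weights: Dict[str, Tensor] = {}
        for name, tensor in state_dict.items():
            parts = name.split(".")
            if parts[0] == "layers":
                # "layers.<i>.<rest>" goes into the dict for layer i under key "<rest>".
                layer_weights[int(parts[1])][".".join(parts[2:])] = tensor
            else:
                global_weights[name] = tensor
        # Single handoff point: everything loaded in Python is passed in one call.
        self.native_op.init(layer_weights, global_weights)


state_dict = {
    "embedding.weight": [0.1, 0.2],
    "layers.0.attn.qkv.weight": [0.3],
    "layers.1.attn.qkv.weight": [0.4],
}
op = FakeNativeOp()
PythonOpWrapper(op).load(state_dict, num_layers=2)
print(sorted(op.global_weights))  # ['embedding.weight']
print(len(op.layer_weights))      # 2
```

When tracing the real code, the thing to look for is the equivalent of the single `init` call here: the place where the Python wrapper passes the already-loaded weight containers to the object backed by C++.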