quic / cloud-ai-sdk

The Qualcomm Cloud AI SDK (Platform and Apps) enables high-performance deep learning inference on Qualcomm Cloud AI platforms, delivering high throughput and low latency across Computer Vision, Object Detection, Natural Language Processing, and Generative AI models.
https://quic.github.io/cloud-ai-sdk-pages/latest/

Upgrade transformer library support #9

Closed: yh-yao closed this issue 4 months ago

yh-yao commented 6 months ago

The current SDK only supports `transformers==4.32.0`. Newer versions of the transformers library add several features (e.g. `tokenizer.apply_chat_template`, streamers). Upgrading the library would save a lot of time that would otherwise go into reimplementing those features in the SDK repo.
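As a rough illustration of the version gap described above, here is a minimal sketch that checks whether a given transformers version string is new enough for `tokenizer.apply_chat_template`. The helper names are hypothetical, and the 4.34.0 threshold is an assumption based on when chat templates appear to have first shipped; check the transformers release notes before relying on it.

```python
import re

# Assumption: chat templates first shipped in transformers 4.34.0.
CHAT_TEMPLATE_MIN = (4, 34, 0)


def version_tuple(version: str) -> tuple:
    """Turn '4.32.0' or '4.35.0.dev0' into a comparable (major, minor, patch) tuple."""
    # Keep only the first three numeric components; pad short versions with zeros.
    nums = [int(p) for p in re.findall(r"\d+", version)[:3]]
    return tuple(nums + [0] * (3 - len(nums)))


def supports_chat_template(version: str) -> bool:
    """Return True if the version is at or above the assumed minimum."""
    return version_tuple(version) >= CHAT_TEMPLATE_MIN
```

With transformers installed, this could be called as `supports_chat_template(transformers.__version__)`; for the version pinned by the SDK (4.32.0) it returns `False`.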

dapengsmith commented 6 months ago

Hello yh-yao, there is no fixed rule that ties the SDK to a specific transformers version. The only constraint is that Qualcomm's patches are written against a specific transformers version. In general, we only patch src/transformers/modeling_outputs.py and the model-specific modeling_xxx.py. So you can apply the patch manually to those files, and then you can use any transformers version.
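For anyone attempting the manual patch, a small sketch like the following may help locate the files mentioned above inside an installed transformers package. The function names are hypothetical; the layout assumed here is the standard one, where `src/transformers/...` in the repository corresponds to the installed package directory and each model lives under `models/<name>/modeling_<name>.py`. It only finds the files and does not apply any patch.

```python
from pathlib import Path


def patch_targets(package_root, model_name):
    """Return the two files the Qualcomm patch is said to touch.

    package_root: directory of the installed transformers package.
    model_name:   model folder name, e.g. "llama" (illustrative).
    """
    root = Path(package_root)
    return [
        root / "modeling_outputs.py",
        root / "models" / model_name / f"modeling_{model_name}.py",
    ]


def missing_targets(package_root, model_name):
    """List expected patch targets that do not exist on disk."""
    return [p for p in patch_targets(package_root, model_name) if not p.is_file()]
```

With transformers installed, the package root can be obtained as `Path(transformers.__file__).parent`. If `missing_targets` returns a non-empty list, the installed version lays out that model differently and the patch will need more than a straight port.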

yh-yao commented 5 months ago

@dapengsmith I am trying to serve llama3-8b. Manually updating the patch code looks tricky. Since we all want the Qualcomm SDK to be used by more people, could you help me update the patch code?

quic-aashwins commented 4 months ago

https://github.com/quic/efficient-transformers is now available for your LLM execution needs on Qualcomm AI 100 accelerators.