AI-Hypercomputer / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0

feat: add quantize exclude layer flag #194

Closed tengomucho closed 1 month ago

tengomucho commented 1 month ago

Instead of `internal_quantize_embedding_layer`, add a flag that allows specifying a list of layers to exclude from quantization. This is a flexible way to avoid quantizing a given set of layers: the Embedding layer, but also a few more if required.
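The exclusion logic described above could be sketched as a simple name filter applied while walking the model's layers. This is a minimal illustration, not the PR's actual implementation; the helper name `should_quantize` and the example layer names are assumptions for demonstration:

```python
def should_quantize(layer_name: str, exclude: list[str]) -> bool:
    # A layer is quantized unless its qualified name matches an excluded
    # name exactly or lives under an excluded submodule prefix.
    return not any(
        layer_name == ex or layer_name.startswith(ex + ".")
        for ex in exclude
    )


# Hypothetical layer names for a decoder-only model.
layers = [
    "tok_embeddings",
    "layers.0.attention.wq",
    "layers.0.feed_forward.w1",
    "output",
]
exclude = ["tok_embeddings", "output"]

quantized = [name for name in layers if should_quantize(name, exclude)]
print(quantized)  # only the inner transformer layers remain
```

With a flag like this, excluding the embedding layer becomes one entry in a list rather than a dedicated boolean, which is what makes it more flexible than the single-purpose `internal_quantize_embedding_layer` option.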

tengomucho commented 1 month ago

Note this can be an alternative to https://github.com/AI-Hypercomputer/jetstream-pytorch/pull/191.

qihqi commented 1 month ago

lgtm, please run `pyink --pyink-indentation 2 --line-length 80 --verbose --extend-exclude=deps .` to format the files