tengomucho closed 1 month ago
Note this can be an alternative to https://github.com/AI-Hypercomputer/jetstream-pytorch/pull/191.
lgtm, please run the following to format the files:
pyink --pyink-indentation 2 --line-length 80 --verbose --extend-exclude=deps .
Instead of internal_quantize_embedding_layer, add a flag that takes a list of layers to exclude from quantization. This is a more flexible solution: it can skip the Embedding layer, and a few more layers if required.
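A minimal sketch of what such a flag could look like, written with plain argparse so it runs on its own. The flag name quantize_exclude_layers, the helper should_quantize, and the layer names used below are hypothetical and only illustrate the idea; they are not taken from this PR or from the jetstream-pytorch codebase.

```python
import argparse
from typing import List, Sequence


def _parse_layer_list(value: str) -> List[str]:
  # Split a comma-separated string into layer names, dropping empty entries.
  return [name.strip() for name in value.split(",") if name.strip()]


def build_parser() -> argparse.ArgumentParser:
  parser = argparse.ArgumentParser()
  # Hypothetical flag: layers listed here are left unquantized.
  parser.add_argument(
      "--quantize_exclude_layers",
      type=_parse_layer_list,
      default=[],
      help="Comma-separated layer names to skip during quantization.",
  )
  return parser


def should_quantize(layer_name: str, excluded: Sequence[str]) -> bool:
  # Match either the full dotted name or just its last component.
  leaf = layer_name.split(".")[-1]
  return layer_name not in excluded and leaf not in excluded
```

With this shape, passing --quantize_exclude_layers=tok_embeddings would reproduce the "do not quantize the embedding layer" behavior (assuming the embedding module is named tok_embeddings), and more layers can be appended to the list without adding new flags.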