Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License, 1.29k stars, 81 forks
RuntimeError: expected scalar type Float but found Half #215
Found an issue when loading the Salesforce/codet5-large-ntp-py model.
basaran_1 | ERROR:waitress:Exception while serving /v1/completions
basaran_1 | Traceback (most recent call last):
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/waitress/channel.py", line 428, in service
basaran_1 | task.service()
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/waitress/task.py", line 168, in service
basaran_1 | self.execute()
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/waitress/task.py", line 456, in execute
basaran_1 | for chunk in app_iter:
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/werkzeug/wsgi.py", line 289, in __next__
basaran_1 | return self._next()
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
basaran_1 | for item in iterable:
basaran_1 | File "/app/basaran/__main__.py", line 187, in stream
basaran_1 | for choice in stream_model(**options):
basaran_1 | File "/app/basaran/model.py", line 73, in __call__
basaran_1 | for (
basaran_1 | File "/app/basaran/model.py", line 215, in generate
basaran_1 | kwargs["encoder_outputs"] = encoder(**encoder_kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1423, in _call_impl
basaran_1 | return forward_call(*input, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/accelerate/hooks.py", line 165, in new_forward
basaran_1 | output = old_forward(*args, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/transformers/models/t5/modeling_t5.py", line 1090, in forward
basaran_1 | layer_outputs = layer_module(
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1423, in _call_impl
basaran_1 | return forward_call(*input, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/accelerate/hooks.py", line 165, in new_forward
basaran_1 | output = old_forward(*args, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/transformers/models/t5/modeling_t5.py", line 693, in forward
basaran_1 | self_attention_outputs = self.layer[0](
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1423, in _call_impl
basaran_1 | return forward_call(*input, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/accelerate/hooks.py", line 165, in new_forward
basaran_1 | output = old_forward(*args, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/transformers/models/t5/modeling_t5.py", line 599, in forward
basaran_1 | normed_hidden_states = self.layer_norm(hidden_states)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1423, in _call_impl
basaran_1 | return forward_call(*input, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/accelerate/hooks.py", line 165, in new_forward
basaran_1 | output = old_forward(*args, **kwargs)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/apex/normalization/fused_layer_norm.py", line 386, in forward
basaran_1 | return fused_rms_norm_affine(input, self.weight, self.normalized_shape, self.eps)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/apex/normalization/fused_layer_norm.py", line 189, in fused_rms_norm_affine
basaran_1 | return FusedRMSNormAffineFunction.apply(*args)
basaran_1 | File "/usr/local/lib/python3.8/dist-packages/apex/normalization/fused_layer_norm.py", line 69, in forward
basaran_1 | output, invvar = fused_layer_norm_cuda.rms_forward_affine(
basaran_1 | RuntimeError: expected scalar type Float but found Half
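The log shows the exception being raised while serving /v1/completions. For anyone trying to reproduce it, here is a minimal sketch of a request against that endpoint. The host, port, prompt, and the helper names `build_completion_request` and `send` are assumptions for illustration, not taken from the report; the JSON fields follow the OpenAI text completion format that Basaran is described as compatible with.

```python
import json
import urllib.request

# Assumption: the server is reachable locally; host and port are hypothetical.
BASE_URL = "http://127.0.0.1:80"


def build_completion_request(prompt, max_tokens=16, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_completion_request("def add(a, b):"))` against a running server with the model loaded should exercise the same code path as the traceback, assuming the endpoint accepts a JSON body.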
I've forked this repo and added a fix; however, I think it breaks every other model, so I didn't open a PR.
I can still open one if you'd like.
https://github.com/lvnvceo/basaran/commit/61c1d4131e6de5798166e6ccab72f5e865a4fcab
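The error message indicates that the fused RMS norm kernel received a Half (float16) tensor where it expected Float (float32). As an illustration of that class of mismatch only, here is a minimal pure-PyTorch sketch of an RMS norm that casts its operands explicitly before multiplying. It is not the apex kernel and not the change in the linked commit; `rms_norm` is a hypothetical helper, and whether this approach resolves the issue is untested.

```python
import torch


def rms_norm(hidden_states, weight, eps=1e-6):
    """RMS layer norm that tolerates mixed float16/float32 operands."""
    # Accumulate the variance in float32 for numerical stability.
    as_float = hidden_states.to(torch.float32)
    variance = as_float.pow(2).mean(-1, keepdim=True)
    normed = as_float * torch.rsqrt(variance + eps)
    # Cast to the weight's dtype so both operands of the product agree.
    return weight * normed.to(weight.dtype)
```

With this shape of function, a float16 input paired with a float32 weight (or the reverse) yields an output in the weight's dtype instead of raising a scalar type error.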