opensearch-project / opensearch-py

Python Client for OpenSearch
https://opensearch.org/docs/latest/clients/python/
Apache License 2.0

[BUG] Pre-trained tas-b model won't auto-truncate the doc #346

Closed b4sjoo closed 1 year ago

b4sjoo commented 1 year ago

What is the bug?

Our pre-trained tas-b model won't accept a doc with a token length exceeding 512

How can one reproduce the bug?

Upload our tas-b model (in ONNX or TorchScript form) to an OpenSearch cluster and use it to embed a long doc; the request returns:

"type": "m_l_exception",
    "reason": "m_l_exception: Failed to inference TEXT_EMBEDDING model: YdGX4oYBhpgQGXXi9WeO",
    "caused_by": {
      "type": "privileged_action_exception",
      "reason": "privileged_action_exception: null",
      "caused_by": {
        "type": "translate_exception",
        "reason": "translate_exception: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript, serialized code (most recent call last):
File \"code/__torch__/sentence_transformers/SentenceTransformer.py\", line 14, in forward\n    input_ids = input[\"input_ids\"]\n    mask = input[\"attention_mask\"]\n    _2 = (_0).forward(input_ids, mask, )\n          ~~~~~~~~~~~ <--- HERE\n    _3 = {\"input_ids\": input_ids, \"attention_mask\": mask, \"token_embeddings\": _2, \"sentence_embedding\": (_1).forward(_2, )}\n    return _3\n  File \"code/__torch__/sentence_transformers/models/Transformer.py\", line 11, in forward\n    mask: Tensor) -> Tensor:\n    auto_model = self.auto_model\n    _0 = (auto_model).forward(input_ids, mask, )\n          ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\n  File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 13, in forward\n    transformer = self.transformer\n    embeddings = self.embeddings\n    _0 = (transformer).forward((embeddings).forward(input_ids, ), mask, )\n                                ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\nclass Embeddings(Module):\n  File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 38, in forward\n    _3 = (word_embeddings).forward(input_ids, )\n    _4 = (position_embeddings).forward(input, )\n    input0 = torch.add(_3, _4)\n             ~~~~~~~~~ <--- HERE\n    _5 = (dropout).forward((LayerNorm).forward(input0, ), )\n    return _5\n\nTraceback of TorchScript, original code (most recent call last):\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(130): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(578): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): 
_slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/usr/local/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py(66): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/container.py(204): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(976): trace_module\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(759): trace\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/opensearch_py_ml/ml_models/sentencetransformermodel.py(778): save_as_pt\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/test.py(34): <module>
RuntimeError: The size of tensor a (650) must match the size of tensor b (512) at non-singleton dimension 1",
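The final `RuntimeError` comes from `torch.add(_3, _4)` in DistilBERT's embedding layer: the word embeddings are computed for all 650 input tokens, but the position-embedding table only has 512 rows, so the elementwise add fails at the sequence-length dimension. A minimal sketch of that failure mode, modeling tensor shapes as plain tuples (no torch required; `add_embeddings` is an illustrative helper, not real library code):

```python
def add_embeddings(word_shape, pos_shape):
    """Simulate the shape check torch.add performs when summing
    word embeddings (batch, seq_len, dim) with position embeddings.

    Broadcasting only reconciles differing sizes when one of them is 1;
    otherwise dimension 1 (the sequence length) must match exactly,
    which is the failure reported in the stack trace above.
    """
    if word_shape[1] != pos_shape[1] and 1 not in (word_shape[1], pos_shape[1]):
        raise RuntimeError(
            f"The size of tensor a ({word_shape[1]}) must match the size of "
            f"tensor b ({pos_shape[1]}) at non-singleton dimension 1"
        )
    return (word_shape[0], max(word_shape[1], pos_shape[1]), word_shape[2])


# 512 tokens fit the 512-row position table; 650 tokens do not.
add_embeddings((1, 512, 768), (1, 512, 768))  # ok
# add_embeddings((1, 650, 768), (1, 512, 768))  # raises RuntimeError
```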

What is the expected behavior?

Ideally, the model should auto-truncate any doc exceeding its 512-token limit instead of failing.
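Until the model does this itself, a client-side workaround is to truncate the input before embedding. With a Hugging Face tokenizer that means passing `truncation=True, max_length=512` when encoding; the pure-Python sketch below (the `truncate_token_ids` helper is hypothetical, not part of opensearch-py) shows just the clipping behavior:

```python
# Hypothetical sketch of the truncation the tokenizer should apply.
# 512 is the limit reported in the stack trace for the tas-b model.
MAX_LENGTH = 512

def truncate_token_ids(token_ids, max_length=MAX_LENGTH):
    """Clip a token-id sequence so the model never sees more than
    max_length tokens. Special-token handling ([CLS]/[SEP] for
    DistilBERT-style models) is omitted for simplicity: this simply
    keeps the leading max_length ids.
    """
    if len(token_ids) <= max_length:
        return list(token_ids)
    return list(token_ids[:max_length])


# A 650-token doc (the length from the error) is clipped to 512.
truncated = truncate_token_ids(range(650))
```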

b4sjoo commented 1 year ago

This issue should be created in the opensearch-py-ml repo, not here.