Feature request

It would be nice to allow fetching the token embeddings from a cross-encoding, which is necessary to implement systems such as retrieval-augmented named entity recognition (RA-NER). Ideally, this would be implemented via an endpoint akin to the existing `/embed_all` endpoint, but taking an additional argument that plays the role of the `text_pair` argument of the Transformers tokenizer. In addition to the token embeddings, this new endpoint would return `token_type_ids`, so that callers can distinguish which token embeddings correspond to which sequence (`text` or `text_pair`, in the parlance of Transformers tokenizers). Additionally, I believe this would help round out the API, as this functionality is available in the Transformers library but unavailable here.
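For reference, here is a minimal sketch of how the Transformers tokenizer produces `token_type_ids` for a sequence pair (the model name is only an illustrative choice):

```python
from transformers import AutoTokenizer

# bert-base-uncased is an arbitrary example of a model whose tokenizer
# emits token_type_ids distinguishing the two sequences.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoding = tokenizer("This is a query.", text_pair="This is a doc for query 1.")

# 0 marks tokens from `text` (plus its special tokens),
# 1 marks tokens from `text_pair`.
print(encoding["token_type_ids"])
```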
An MWE of calling the endpoint as I would like to is as follows:
```python
import asyncio

import aiohttp


async def main():
    payload = {
        "inputs": ["This is a query.", "This is a second query."],
        "inputs_pair": ["This is a doc for query 1.", "This is a doc for query 2."],
    }
    # Use the session as a context manager so the connection is closed cleanly.
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://127.0.0.1:8080/embed_all_cross_encoding",
            headers={"Content-Type": "application/json"},
            json=payload,
        ) as response:
            data = await response.json()
            token_embeddings = data["token_embeddings"]
            token_type_ids = data["token_type_ids"]


if __name__ == "__main__":
    asyncio.run(main())
```
Here, `token_embeddings` has shape `batch_size * sequence_length * n_dims`, and `token_type_ids` has shape `batch_size * sequence_length`.
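To illustrate how the response could be consumed, here is a minimal sketch of separating the two sequences' token embeddings via `token_type_ids`. The dummy arrays stand in for the hypothetical endpoint's output, with the shapes described above:

```python
import numpy as np

# Dummy values standing in for data["token_embeddings"] and
# data["token_type_ids"] from the MWE above.
batch_size, sequence_length, n_dims = 2, 16, 384
token_embeddings = np.zeros((batch_size, sequence_length, n_dims))
token_type_ids = np.zeros((batch_size, sequence_length), dtype=int)
token_type_ids[:, sequence_length // 2 :] = 1  # pretend the second half is the pair

for embeddings, type_ids in zip(token_embeddings, token_type_ids):
    text_embeddings = embeddings[type_ids == 0]  # tokens from `inputs`
    pair_embeddings = embeddings[type_ids == 1]  # tokens from `inputs_pair`
    # e.g. feed `text_embeddings` to a downstream NER head for RA-NER
```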
Motivation
Fetching token embeddings from a cross-encoding serves two purposes:
(i) Enables implementation of systems such as RA-NER.
(ii) Helps round out the API, bringing over functionality that is available in the Transformers library but as yet unavailable here.
Your contribution
I could contribute to examples and/or documentation. Thank you!