For one of my applications, I need a fixed-length embedding for each word, but the current API has no option to request one. For example, if I encode the two words "invoice" and "operator" with an embedding length of 50, "invoice" gets an array of size 50*3 because the encoder splits it into 3 subwords, while "operator" gets an array of size 50 because the encoder does not split it. Is there a way to get an array of constant size for every word? Please see the output below for reference.
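To make the request concrete, here is a minimal sketch of the kind of result I am after, done by pooling the subword vectors outside the library. It assumes the encoder output can be viewed as one row of length 50 per subword (or a row-major flattened version of that); `pool_subword_vectors` is a hypothetical helper of mine, not part of the API, and the random arrays only stand in for real embeddings.

```python
import numpy as np


def pool_subword_vectors(subword_vectors, dim, mode="mean"):
    """Collapse a variable number of subword vectors into one vector of length `dim`.

    Hypothetical helper: accepts either a (n_subwords, dim) array or a
    flattened array of length n_subwords * dim (assumed row-major).
    """
    arr = np.asarray(subword_vectors, dtype=float).reshape(-1, dim)
    if mode == "mean":
        return arr.mean(axis=0)  # average over subwords
    if mode == "max":
        return arr.max(axis=0)   # element-wise maximum over subwords
    raise ValueError(f"unknown pooling mode: {mode!r}")


dim = 50
rng = np.random.default_rng(0)

# Stand-ins for real encoder output: "invoice" split into 3 subwords,
# "operator" kept as a single token.
invoice_vectors = rng.normal(size=(3, dim))
operator_vectors = rng.normal(size=(1, dim))

invoice_fixed = pool_subword_vectors(invoice_vectors, dim)
operator_fixed = pool_subword_vectors(operator_vectors, dim)

print(invoice_fixed.shape, operator_fixed.shape)  # (50,) (50,)
```

Mean pooling is only one possible choice. Max pooling or taking the first subword vector would also give a constant size, so a built-in option in the API to pick the strategy would be ideal.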