Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
I'm trying to build a machine translation model using the indicBERT model as an embedding. I'm able to obtain token embeddings from a tokenized sentence as follows:
However, I can't find a way to map these embeddings back to token ids. How would I go about doing this?
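To make the question concrete, one approach would be a nearest-neighbour lookup against the embedding matrix. The sketch below is only an illustration under stated assumptions: the table is a made-up toy, the helper names (`nearest_token_id`, `embeddings_to_token_ids`) are hypothetical, and none of it is indicBERT's API. It would presumably only apply to static input embeddings, since contextual outputs from the encoder are not rows of that matrix.

```python
import math

# Toy stand-in for a model's input embedding matrix: row i is the vector
# for token id i. These values are invented for illustration and are NOT
# indicBERT weights.
EMBEDDING_TABLE = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_token_id(vector, table):
    """Return the row index (token id) whose embedding is most similar to `vector`."""
    return max(range(len(table)), key=lambda i: cosine_similarity(vector, table[i]))


def embeddings_to_token_ids(vectors, table):
    """Map each embedding vector to the id of its nearest row in `table`."""
    return [nearest_token_id(v, table) for v in vectors]


if __name__ == "__main__":
    # Slightly perturbed copies of rows 0, 2 and 3.
    noisy = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.8], [0.6, 0.8, 0.0]]
    print(embeddings_to_token_ids(noisy, EMBEDDING_TABLE))
```

With a real model, the table would be the model's input embedding weights instead of a toy list, and the linear scan would likely need to be replaced by a batched matrix product for a full vocabulary.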
Thanks! Vimal