Closed MatthiasEg closed 1 year ago
@MatthiasEg we don't use Tag.TEXT. If you want to feed text features to a TF4Rec model, you first need to convert them to a numerical representation (embeddings would be better) using BERT, GPT2, or whatever model you want to use, and then feed them to the model as pre-trained embeddings. You can check out this unit test example.
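To make the text-to-embedding step concrete, here is a minimal sketch of the preprocessing only, not Transformers4Rec code. It assumes each distinct text value gets an integer id and one embedding row, with row 0 reserved for padding. `placeholder_encoder`, `build_embedding_table`, and `EMBED_DIM` are hypothetical names. The hash-based encoder is a stand-in so the snippet runs without downloading a model, and in practice you would swap in BERT, GPT2, or a similar encoder. How the resulting table is handed to the model is not shown here; the unit test example mentioned above is the reference for that.

```python
import hashlib
from typing import Callable, Dict, List, Sequence, Tuple

EMBED_DIM = 8  # hypothetical size; a real encoder such as BERT-base yields 768


def placeholder_encoder(text: str, dim: int = EMBED_DIM) -> List[float]:
    """Stand-in for a real text encoder (assumption, not a real model).

    Produces a deterministic pseudo-embedding from a hash of the text.
    The values carry no semantic meaning; replace this with BERT, GPT2, etc.
    """
    digest = hashlib.sha256(text.lower().encode("utf-8")).digest()
    # Map the first `dim` bytes (0..255) to floats in [-1, 1].
    return [byte / 127.5 - 1.0 for byte in digest[:dim]]


def build_embedding_table(
    texts: Sequence[str],
    encoder: Callable[[str], List[float]] = placeholder_encoder,
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Give each distinct text an integer id and encode it exactly once.

    Id 0 is reserved for padding and maps to an all-zero row.
    """
    text_to_id: Dict[str, int] = {}
    rows: List[List[float]] = []
    for text in texts:
        if text not in text_to_id:
            text_to_id[text] = len(rows) + 1  # ids start at 1
            rows.append(encoder(text))
    dim = len(rows[0]) if rows else 0
    table = [[0.0] * dim] + rows  # prepend the padding row
    return text_to_id, table


# Example: a short text column from interaction data.
titles = ["red running shoes", "blue denim jacket", "red running shoes"]
text_to_id, table = build_embedding_table(titles)

# The string column becomes a numeric id column; the table holds the vectors.
title_ids = [text_to_id[t] for t in titles]
```

With this layout the dataset itself only carries integer ids, so no string column has to go through the data pipeline, and the embedding table can be stored separately.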
Please note that cuDF DOES support string-based data, but how practical that is depends on how long your strings are and how large your dataset is.
@MatthiasEg I am closing this ticket due to low activity. Please reopen if needed.
❓ Questions & Help
Details
Hello everybody,
I'm trying to model sequential data with various properties, one of which is a text field (1-10 words). What is the intended process for including such text fields with Transformers4Rec? I have seen that there is a schema tag (Tag.TEXT), but this alone is of no use, as cuDF apparently does not support string-based data at this time. Should the text therefore be tokenized in advance, with Tag.TOKENIZED added as well?
Any help is greatly appreciated!