Store tf.data after the transformation needed for BERT in TFRecord files (preprocessing done only one)

tarrade / proj_multilingual_text_classification

Explore multilingal text classification using embedding, bert and deep learning architecture

Apache License 2.0

5 stars 2 forks source link

first implementation is done

assert cardinality doesn't exist with TF 2.1.0, waiting for 2.2.0 to clean up the code and made it a bit easier.

Big discussion about how to keep track of metadata with tf.data and TFRecord file ?

tarrade / proj_multilingual_text_classification