gretelai / gretel-synthetics

Synthetic data generators for structured and unstructured text, featuring differentially private learning.
https://gretel.ai/platform/synthetics
Other
590 stars 87 forks

Aw/core 84 - Auto-select Tokenizer #91

Closed zredlined closed 3 years ago

zredlined commented 3 years ago

Automatically select character-based tokenization over SentencePiece if vocab size is set to zero.
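The selection rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `select_tokenizer`, `char_encode`, and the `"char"` / `"sentencepiece"` labels are hypothetical names and are not part of the gretel-synthetics API.

```python
from typing import List

# Hypothetical labels for the two tokenizer kinds (not gretel-synthetics names).
CHAR = "char"
SENTENCEPIECE = "sentencepiece"


def select_tokenizer(vocab_size: int) -> str:
    """Pick a tokenizer kind from the configured vocab size.

    Assumption: a vocab size of zero means "no subword vocabulary requested",
    so fall back to character-based tokenization; any positive value keeps
    SentencePiece.
    """
    if vocab_size < 0:
        raise ValueError("vocab_size must be >= 0")
    return CHAR if vocab_size == 0 else SENTENCEPIECE


def char_encode(text: str) -> List[int]:
    """Minimal character-level encoding: one integer ID per distinct character."""
    # IDs are assigned in sorted character order so the mapping is deterministic.
    vocab = {ch: idx for idx, ch in enumerate(sorted(set(text)))}
    return [vocab[ch] for ch in text]
```

For example, `select_tokenizer(0)` returns `"char"`, while `select_tokenizer(20000)` returns `"sentencepiece"`. Keying the choice off the vocab size avoids a separate configuration flag, on the assumption that a zero-size subword vocabulary has no other sensible meaning.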