Closed by pcuenca 9 months ago
Is this really going to work for T5?
Yes, the pieces we've been building in this PR are meant to support a new UnigramTokenizer class (see https://github.com/huggingface/transformers/blob/f40b87de0ca234df61f76928956c4a2118c0b548/src/transformers/models/t5/tokenization_t5_fast.py#L68-L69) that mimics the way the Rust tokenizers library works. Don't pay attention to this line; it will change once the new class is ready.
I've only been able to work sporadically on this PR, but most of the pieces are there. We should have something ready by the end of the week unless I find some obstacle.
I see, thanks for the quick response. Is it done, or is it still in progress? When do you think you will merge?
This is ready to test / review now.
Merging now. @hassanzadeh, please do let me know if you see anything unusual.
Hey guys, thanks for your support. May I ask what your estimated time is for this to be ready?
Best