Closed: stefan-it closed this issue 3 years ago
Update on that: model and checkpoints are released:
https://github.com/google-research/language/tree/master/language/canine
:hugs:
Hi @stefan-it, thanks for the update. Do you know how we can use those pre-trained tensorflow checkpoints to get the pooled text representations from CANINE model? Thanks
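While the checkpoints aren't wired into a library yet, one useful detail is how CANINE builds its inputs: there is no subword vocabulary, and input ids are simply Unicode codepoints, with special tokens drawn from the private-use area (0xE000 for `[CLS]`, 0xE001 for `[SEP]` in the reference implementation — treat these exact values as an assumption to verify against the checkpoint). A minimal sketch of the encoding step:

```python
# Sketch of CANINE-style input construction: ids are raw Unicode
# codepoints, with special tokens from the private-use area.
# The special codepoint values below follow the paper's reference
# implementation and are assumptions, not a published API.
CLS = 0xE000  # [CLS] codepoint (assumed from the reference code)
SEP = 0xE001  # [SEP] codepoint (assumed)
PAD = 0       # padding codepoint

def encode(text: str, max_len: int = 16) -> list[int]:
    """Turn a string into CANINE-style input ids, truncated and padded."""
    ids = [CLS] + [ord(c) for c in text][: max_len - 2] + [SEP]
    return ids + [PAD] * (max_len - len(ids))

print(encode("hi", max_len=6))  # [57344, 104, 105, 57345, 0, 0]
```

Getting the pooled representation out would then just be a matter of running these ids through the restored checkpoint and reading the pooler output.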
Any updates on this?
Hi,
I've started working on this. Forward pass in PyTorch is working, and giving me the same output tensors as the TF implementation on the same input data.
Will open a PR soon
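For anyone doing a similar port, the parity check between the two implementations can be as simple as comparing output tensors within a small tolerance — a generic sketch (the function and variable names are illustrative, not the actual PR code):

```python
import numpy as np

def outputs_match(tf_out, pt_out, atol=1e-5):
    """Compare two forward-pass outputs element-wise within a tolerance.

    Small numerical differences between frameworks are expected,
    so exact equality is too strict; allclose with a small atol
    is the usual check for a successful port.
    """
    tf_out = np.asarray(tf_out)
    pt_out = np.asarray(pt_out)
    return tf_out.shape == pt_out.shape and np.allclose(tf_out, pt_out, atol=atol)

# e.g. outputs_match(tf_hidden_states, pt_hidden_states.detach().numpy())
```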
Hi @dhgarrette,
I don't want to spam the CANINE PR with this question/discussion, so I'm asking it here in this issue 😅
So I would like to use CANINE for token classification (I'm currently implementing it into the Flair framework...), and for that reason tokenized input is passed to the model. For token classification with e.g. BERT, one would use the first subword as the pooling strategy. But when using CANINE and following the subword "analogy", is using the embedding of the first character a good strategy (instead of e.g. mean pooling)? 🤔
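For what it's worth, the first-character strategy mirrors BERT's first-subword pooling and can be sketched independently of any framework: given per-character hidden states and the start offset of each token, gather the rows at those offsets (all names here are illustrative, not Flair or Transformers API):

```python
import numpy as np

def first_char_pool(char_hidden: np.ndarray, token_starts: list[int]) -> np.ndarray:
    """Pool character-level hidden states down to one vector per token
    by taking the embedding of each token's first character — the
    character-level analogue of BERT's first-subword pooling."""
    return char_hidden[token_starts]

# "hi there": the two tokens start at character offsets 0 and 3
hidden = np.arange(8 * 4, dtype=float).reshape(8, 4)  # (seq_len=8, dim=4)
token_embs = first_char_pool(hidden, [0, 3])
print(token_embs.shape)  # (2, 4)
```

Mean pooling over each token's character span would be the obvious alternative; which works better is an empirical question.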
🌟 New model addition
Model description
Google recently proposed a new architecture: the Character Architecture with No tokenization In Neural Encoders (CANINE). Not only is the title exciting:
Overview of the architecture:
Paper is available here.
We badly need this architecture in Transformers (RIP subword tokenization)!
The first author (Jonathan Clark) said on Twitter that the model and code will be released in April :partying_face:
Open source status