Open jondot opened 10 months ago
Hello @jondot ,
You are correct this is currently missing. It is probably straightforward to add and I could publish an update, unless you would like to work on this?
I haven't yet studied the mechanics of implementing such a thing, and I can't manage to keep up (I just learned about ELECTRA's existence via a model I need to use). Could you point me to some learning material that would give me a high chance of success?
@guillaume-be from what I could gather from the code, since there is already a TokenClassifier, I would just copy it and rename the copied structs and such to SequenceClassifier. The rest is again just copying things around; there's no new logic that separates a token classifier from a sequence classifier other than the stuff in the pipeline. Inside the models it's all similar. So given that Electra has a token classifier, the job is already done for the sequence classifier. Is that a correct assumption?
I would recommend checking a combination of the following resources:
- The ElectraClassificationHead submodule.
- A SequenceClassification model implementation in this crate, such as DebertaV2ForSequenceClassification.

This will give some high-level guidance on how to create a new struct with submodules as needed to reflect the Python implementation.

If you have a candidate model that is useful, it would also be great to publish these weights and register them in the pretrained models; this would allow adding a test for your implementation.
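To make the difference between the two heads concrete, here is a minimal sketch in plain Rust (standard library only, no tensor crate). It is not the crate's actual code: `Linear`, `token_logits`, `ClassificationHead`, and `gelu_tanh` are hypothetical names, and the head layout (first token, then dense, activation, output projection, with dropout omitted) is an assumption that should be checked against the Python implementation.

```rust
/// Hypothetical dense layer: `weight` is [out_features][in_features].
pub struct Linear {
    pub weight: Vec<Vec<f32>>,
    pub bias: Vec<f32>,
}

impl Linear {
    /// Computes `weight * x + bias` for a single vector.
    pub fn forward(&self, x: &[f32]) -> Vec<f32> {
        self.weight
            .iter()
            .zip(&self.bias)
            .map(|(row, b)| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>() + b)
            .collect()
    }
}

/// Tanh approximation of GELU (the choice of activation is an assumption).
pub fn gelu_tanh(x: f32) -> f32 {
    let c = (2.0 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// Token classification: the classifier is applied to every token,
/// producing one logits vector per token.
pub fn token_logits(classifier: &Linear, hidden: &[Vec<f32>]) -> Vec<Vec<f32>> {
    hidden.iter().map(|h| classifier.forward(h)).collect()
}

/// Sequence classification head (assumed layout): only the first token's
/// hidden state is used, producing one logits vector per sequence.
pub struct ClassificationHead {
    pub dense: Linear,
    pub out_proj: Linear,
}

impl ClassificationHead {
    /// Returns `None` for an empty sequence.
    pub fn forward(&self, hidden: &[Vec<f32>]) -> Option<Vec<f32>> {
        let first = hidden.first()?;
        let x: Vec<f32> = self
            .dense
            .forward(first)
            .into_iter()
            .map(gelu_tanh)
            .collect();
        Some(self.out_proj.forward(&x))
    }
}
```

The point of the sketch is that the sequence head pools the sequence down to a single vector and adds its own submodules, so it is a small new struct rather than a pure rename of the token classifier.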
I've been reading the code, but I'm unsure whether sequence classification is supported in some undocumented way with ELECTRA?