**Stochastic-Adventure** opened this issue 3 years ago

Hi,

I'm wondering how to add ELECTRA and GPT2 support to this module. Neither ELECTRA nor GPT2 has a pooled output, unlike BERT/RoBERTa-based models.

I looked at how the model is implemented in `models.py`: there is no `pooled_output` for the ELECTRA/GPT2 sequence classification models; only `seq_output` is in the `outputs` variable. How can I get around this limitation and get a working version of ELECTRA/GPT2? Thank you!

---

I will look at these two models and get back to you when I add them or find a way to add them.

---
For ELECTRA, you can manually extract the pooled representation of the `[CLS]` token, following `ElectraForSequenceClassification` in HuggingFace (https://huggingface.co/transformers/_modules/transformers/models/electra/modeling_electra.html#ElectraForSequenceClassification):
```python
discriminator_hidden_states = self.electra(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids,
    position_ids=position_ids,
    head_mask=head_mask,
    inputs_embeds=inputs_embeds,
    output_attentions=output_attentions,
    output_hidden_states=output_hidden_states,
    return_dict=return_dict,
)
seq_output = discriminator_hidden_states[0]  # (batch, seq_len, hidden_size)
pooled_output = seq_output[:, 0, :]          # hidden state of the [CLS] token
```
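If it helps, here is a self-contained sketch of the same extraction outside the model's forward pass; `google/electra-small-discriminator` is only a stand-in checkpoint for illustration:

```python
import torch
from transformers import ElectraModel, ElectraTokenizer

# Stand-in checkpoint; any ELECTRA discriminator works the same way.
tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraModel.from_pretrained("google/electra-small-discriminator")

inputs = tokenizer("ELECTRA has no pooler output.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

seq_output = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
pooled_output = seq_output[:, 0, :]     # [CLS] representation, as above
```

The first token of an ELECTRA input is `[CLS]`, so `seq_output[:, 0, :]` plays the same role as BERT's pooled output, minus BERT's extra dense+tanh pooler layer.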
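For GPT2 there is no `[CLS]` token, so a common workaround (and what HuggingFace's `GPT2ForSequenceClassification` does internally) is to pool the hidden state of the last non-padding token. A minimal sketch, assuming the `gpt2` checkpoint as a stand-in and right-side padding:

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

# "gpt2" is a stand-in checkpoint for illustration.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT2 has no pad token by default
model = GPT2Model.from_pretrained("gpt2")

texts = ["GPT2 has no pooler output.", "Pool the last token instead."]
inputs = tokenizer(texts, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

seq_output = outputs.last_hidden_state              # (batch, seq_len, hidden_size)
# Index of the last non-padding token in each sequence (right padding assumed).
last_idx = inputs["attention_mask"].sum(dim=1) - 1  # (batch,)
pooled_output = seq_output[torch.arange(seq_output.size(0)), last_idx, :]
```

Because GPT2 is a left-to-right model, the last real token is the only position that has attended to the full sequence, which makes it the natural pooling choice.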