thunlp / OpenPrompt

An Open-Source Framework for Prompt-Learning.
https://thunlp.github.io/OpenPrompt/
Apache License 2.0
4.38k stars 455 forks

How to reproduce ProtoVerb with single verbalizer? #228

Closed Dousia closed 1 year ago

Dousia commented 1 year ago

Hi, I am very interested in your work Prototypical Verbalizer for Prompt-based Few-shot Tuning.

While trying to reproduce your experimental results, I got confused because there seems to be no proper script for the ProtoVerb-as-single-verbalizer setting. In openprompt/protoverb_trainer.py, lines 147-157 seem to train the model with the manual verbalizer regardless of the input parameters:

        for self.cur_epoch in range(self.cur_epoch, self.num_epochs):
            continue_training = self.training_epoch(self.cur_epoch)
            score = self.inference_epoch("validation")
            copy = None
            if self.best_score is None or ((score - self.best_score) >= 0) == self.config.checkpoint.higher_better:
                copy = 'best'
                self.best_score = score
            self.save_checkpoint('last', extra={"validation_metric": score}, copy = copy)
            if continue_training == -1:
                logger.info("Stop training by reaching maximum num_training_steps")
                break

Is it okay to delete these lines and set the train_verblizer mode to "post" for the ProtoVerb-as-single-verbalizer experiment?

Thank you in advance.

cgq15 commented 1 year ago

Hi,

If you want to use ProtoVerb as a single verbalizer, just set multi_verb: proto in the config yaml. https://github.com/thunlp/OpenPrompt/blob/17283c194bf76fd4c06fa89bdf7f947026e53e68/experiments/classification_proto_verbalizer.yaml#L36-L43
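For reference, the relevant part of the config yaml might look like this (a hedged sketch; only the `multi_verb: proto` setting is confirmed above, and the surrounding key names and nesting are assumptions based on the linked yaml):

```yaml
# Hypothetical excerpt of experiments/classification_proto_verbalizer.yaml;
# check the linked file for the exact key names and nesting.
verbalizer: proto_verbalizer   # assumption: selects the prototypical verbalizer
proto_verbalizer:
  multi_verb: proto            # use ProtoVerb as the single verbalizer
```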

In openprompt/prompts/prototypical_verbalizer.py we use this parameter to control the usage of multiple verbalizers. https://github.com/thunlp/OpenPrompt/blob/17283c194bf76fd4c06fa89bdf7f947026e53e68/openprompt/prompts/prototypical_verbalizer.py#L264-L273

Dousia commented 1 year ago

Thank you very much for your reply~

I followed your instructions and set multi_verb: proto in the configuration file, but the process_outputs function still returns manual_logits to train the model in the code I mentioned above. It seems that this parameter only controls the verbalizer used at test time, while the manual verbalizer is still involved in the training procedure.

Am I misunderstanding something or making mistakes in reproduction?

Thank you in advance~

cgq15 commented 1 year ago

You also need to set train_verbalizer to alternate so that the prototypes are updated alternately. Sorry for missing that earlier. https://github.com/thunlp/OpenPrompt/blob/17283c194bf76fd4c06fa89bdf7f947026e53e68/openprompt/protoverb_trainer.py#L158-L159
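Putting both settings together, the config changes would be roughly the following (a hedged sketch; the key placement is an assumption based on the linked yaml, and the exact spelling of the train-verbalizer key should be checked against the repo, since this thread writes it both ways):

```yaml
# Hypothetical combined settings; verify exact keys against the linked yaml.
proto_verbalizer:
  multi_verb: proto            # ProtoVerb as the single verbalizer
train:
  train_verblizer: alternate   # train prototypes alternately with the model
```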

Dousia commented 1 year ago

Thank you for the reminder~

This parameter seems to control the strategy for updating prototype embeddings, but it does not influence the behaviour of the process_outputs function triggered by line 148 of OpenPrompt/openprompt/protoverb_trainer.py, where the manual verbalizer is still used to produce manual_logits.

https://github.com/thunlp/OpenPrompt/blob/17283c194bf76fd4c06fa89bdf7f947026e53e68/openprompt/protoverb_trainer.py#L148

May I ask whether this line of code still introduces manual verbalizer information into the training stage?

Thank you in advance~

cgq15 commented 1 year ago

In the process_outputs function, if self.trained and self.multi_verb == "proto", the function returns only proto_logits, which do not use the manual verbalizer. Setting train_verbalizer to alternate first trains the prototypes before tuning the model and sets self.trained to True. Therefore, with these two parameters set, the manual verbalizer is not used during tuning.
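The control flow described above can be sketched as follows (a minimal stand-in, not the actual OpenPrompt implementation; only `process_outputs`, `trained`, `multi_verb`, and the "proto" value come from this thread, the rest is hypothetical scaffolding):

```python
# Hedged sketch of the process_outputs branching described in this thread.
class ProtoVerbalizerSketch:
    def __init__(self, multi_verb):
        self.multi_verb = multi_verb
        self.trained = False  # flipped to True once prototypes are trained

    def train_proto(self):
        # Stand-in for prototype training: with train_verbalizer set to
        # "alternate", this runs before model tuning in the real trainer.
        self.trained = True

    def process_outputs(self, manual_logits, proto_logits):
        # Only when prototypes are trained AND multi_verb == "proto" does
        # the manual verbalizer drop out of the training loss.
        if self.trained and self.multi_verb == "proto":
            return proto_logits
        return manual_logits
```

With this sketch, a verbalizer constructed with multi_verb="proto" still returns the manual logits until train_proto() has run, which matches the behaviour Dousia observed before setting train_verbalizer to alternate.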

Hope this can help!

Dousia commented 1 year ago

I re-ran the code with train_verbalizer=alternate and found that manual verbalizer information is indeed excluded. My confusion is settled.

Thank you very much for your patience and detailed explanation~