golsun / DialogRPT

EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"
MIT License

How to rerank fine-tuned DialoGPT outputs with DialogRPT using HuggingFace Transformers? #8

Open tsutsen opened 3 years ago

tsutsen commented 3 years ago

I am not satisfied with the responses that DialoGPT produces -- for the most part, they seem pretty random and AI-ish to me. I fine-tuned the model on my dataset using Transformers' Trainer, but that did not help much: the responses are often just quotes from the dataset, out of context. I at least want these quotes to be relevant, so I decided to try DialogRPT human-vs-rand and human-vs-machine.

The problem is I do not understand how to rerank DialoGPT responses with DialogRPT using Transformers. Should I use DialogRPT during fine-tuning to compute loss? Or maybe it is possible to connect it as a LogitsProcessor? If yes, then how? As I understand, Transformers' generate() method outputs scores for every token but DialogRPT outputs a single number. How can I modify the scores of a response then?

I am new to machine learning and this stuff is quite overwhelming for me; any help is very appreciated!

golsun commented 3 years ago

hi @tsutsen ,

The simplest way is to use DialogRPT to rank the hypotheses generated by DialoGPT; this is one example implementation, and you can try this to play with it. Please let me know if you have any questions.
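In case it helps, here is a minimal sketch of that ranking step done purely with HuggingFace Transformers, following the usage shown on the DialogRPT model cards (the rpt_score and rerank helper names are only for illustration):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ranker_tokenizer = AutoTokenizer.from_pretrained("microsoft/DialogRPT-human-vs-machine")
ranker_model = AutoModelForSequenceClassification.from_pretrained("microsoft/DialogRPT-human-vs-machine")
ranker_model.eval()

def rpt_score(cxt, hyp):
    # DialogRPT scores "context <|endoftext|> hypothesis" as a single sequence
    ids = ranker_tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
    with torch.no_grad():
        logits = ranker_model(ids, return_dict=True).logits
    # one logit, squashed to a score in [0, 1]
    return torch.sigmoid(logits).item()

def rerank(cxt, hyps):
    # sort DialoGPT hypotheses by their DialogRPT score, best first
    return sorted(hyps, key=lambda h: rpt_score(cxt, h), reverse=True)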

tsutsen commented 3 years ago

Thank you for the quick reply, @golsun !

I've tried to adapt the code from GPT2Generator and Integrated to get it working with the Transformers model cards, but I get the error 'GPT2ForSequenceClassification' object has no attribute 'predict' on the line scores_ranker = self.ranker.predict(cxt, hyps).

Which predict method should I use for the ranker? The one in model.py or the one in score.py?

The code:


import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoModelForSequenceClassification, AutoTokenizer

# generator (DialoGPT) and ranker (DialogRPT) checkpoints from the HF hub
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")

modelRPT = AutoModelForSequenceClassification.from_pretrained('microsoft/DialogRPT-human-vs-machine')
tokenizerRPT = AutoTokenizer.from_pretrained('microsoft/DialogRPT-human-vs-machine')

class Generator:

    def __init__(self):
        self.tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")       
        self.model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
        self.ix_EOS = self.tokenizer.eos_token_id
        self.model.eval()
        self.cuda = True
        self.model.cuda()

    def tokenize(self, cxt):
        turns = cxt.split(self.tokenizer.eos_token)
        ids = []
        for turn in turns:
            ids += self.tokenizer.encode(turn.strip()) + [self.ix_EOS]
        ids = torch.tensor([ids]).view(1, -1)
        if self.cuda:
            ids = ids.cuda()
        return ids

    def predict_sampling(self, cxt, temperature=0.7, n_hyp=5, max_t=30):
        """ sampling tokens based on predicted probability """

        tokens = self.tokenize(cxt)
        tokens = tokens.repeat(n_hyp, 1)
        len_cxt = tokens.shape[1]
        sum_logP = [0] * n_hyp
        live = [True] * n_hyp
        seqs = [[] for _ in range(n_hyp)]
        np.random.seed(2020)
        for _ in range(max_t):
            outputs = self.model(tokens)
            predictions = outputs[0]
            prob = torch.softmax(predictions[:, -1, :] / temperature, dim=-1)
            if self.cuda:
                prob = prob.cpu()
            prob = prob.detach().numpy()
            vocab = prob.shape[-1]
            next_tokens = []
            for i in range(n_hyp):
                next_token = np.random.choice(vocab, p=prob[i,:])
                next_tokens.append(next_token)
                if not live[i]:
                    continue
                sum_logP[i] += np.log(prob[i, next_token])
                seqs[i].append(next_token)
                if next_token == self.ix_EOS:
                    live[i] = False
                    continue
            next_tokens = torch.LongTensor(next_tokens).view(-1, 1)
            if self.cuda:
                next_tokens = next_tokens.cuda()
            tokens = torch.cat([tokens, next_tokens], dim=-1)

        ret = []
        for i in range(n_hyp):
            if live[i]:     # only return hyp that ends with EOS
                continue
            prob = np.exp(sum_logP[i] / (len(seqs[i]) + 1))
            hyp = self.tokenizer.decode(seqs[i][:-1])   # strip EOS
            ret.append((prob, hyp))
        return ret

    def play(self, params):
        while True:
            cxt = input('\nContext:\t')
            if not cxt:
                break
            ret = self.predict_sampling(cxt, **params)
            for prob, hyp in sorted(ret, reverse=True):
                print('%.3f\t%s'%(prob, hyp))

class Integrated:
    def __init__(self, generator, ranker):
        self.generator = generator
        self.ranker = ranker

    def predict(self, cxt, wt_ranker, params):
        with torch.no_grad():
            prob_hyp = self.generator.predict_sampling(cxt, **params)
        probs = np.array([prob for prob, _ in prob_hyp])
        hyps = [hyp for _, hyp in prob_hyp]
        if wt_ranker > 0:
            scores_ranker = self.ranker.predict(cxt, hyps)
            if isinstance(scores_ranker, dict):
                scores_ranker = scores_ranker['final']
            scores = wt_ranker * scores_ranker + (1 - wt_ranker) * probs
        else:
            scores = probs
        ret = []
        for i in range(len(hyps)):
            ret.append((scores[i], probs[i], scores_ranker[i], hyps[i]))
        ret = sorted(ret, reverse=True)
        return ret

generator = Generator()
params = {'temperature': 0.7, 'n_hyp': 5}

Integrated(generator, modelRPT).predict('How are you?', 1, params)

The error:

AttributeError                            Traceback (most recent call last)
<ipython-input-11-ef5210f29a1f> in <module>
    103 params = {'temperature': 0.7, 'n_hyp': 5}
    104 
--> 105 Integrated(generator, modelRPT).predict('How are you?', 1, params)

<ipython-input-11-ef5210f29a1f> in predict(self, cxt, wt_ranker, params)
     88         hyps = [hyp for _, hyp in prob_hyp]
     89         if wt_ranker > 0:
---> 90             scores_ranker = self.ranker.predict(cxt, hyps)
     91             if isinstance(scores_ranker, dict):
     92                 scores_ranker = scores_ranker['final']

/usr/lib/python3.9/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
    945             if name in modules:
    946                 return modules[name]
--> 947         raise AttributeError("'{}' object has no attribute '{}'".format(
    948             type(self).__name__, name))
    949 

AttributeError: 'GPT2ForSequenceClassification' object has no attribute 'predict'

UPD: Never mind, I figured out how to implement it. I will put it into Colab and share a link for others in a few days.
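For anyone looking for the fix: the AttributeError happens because GPT2ForSequenceClassification is a bare torch module with only forward(), not a predict() method. A rough sketch of the kind of wrapper that resolves it (assuming the scoring from the DialogRPT model card; the Ranker class name is just illustrative) gives the HF model the predict(cxt, hyps) interface that Integrated expects:

import numpy as np
import torch

class Ranker:
    # wraps a DialogRPT sequence-classification model so it exposes the
    # predict(cxt, hyps) interface that Integrated calls
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        self.model.eval()

    def predict(self, cxt, hyps):
        scores = []
        with torch.no_grad():
            for hyp in hyps:
                ids = self.tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
                logits = self.model(ids, return_dict=True).logits
                scores.append(torch.sigmoid(logits).item())
        return np.array(scores)

# usage:
# ranker = Ranker(modelRPT, tokenizerRPT)
# Integrated(generator, ranker).predict('How are you?', 0.5, params)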

OlegBEZb commented 3 years ago

@tsutsen, any updates?

dayuyang1999 commented 2 years ago

this

Can you share the colab code? Thank you so much!