uber-research / PPLM

Plug and Play Language Model implementation. Allows steering the topic and attributes of text generated with GPT-2 models.
Apache License 2.0

Formatting training text files for discriminator training script #33

Open alisawuffles opened 3 years ago

alisawuffles commented 3 years ago

Hi there, I hope to try discriminator-based PPLM with different sizes of GPT2. To do this, I believe we need to retrain the discriminator with a different embedding size using the paper_code/gpt2tunediscrim.py script. (Please correct me if I'm wrong here!) However, I am a little unclear on how the training text files should be formatted to be compatible with this code. It looks like each line in toxic_train.txt is processed with eval(d) to become a dictionary or json-like object with the keys 'text' and 'label'. Here is the excerpt of code I am looking at:

with open("datasets/toxic/toxic_train.txt") as f:
    data = []
    for d in f:
        # each line is a dict literal with 'text' and 'label' keys
        data.append(eval(d))

x = []
y = []
for d in data:
    try:
        # seq = tokenizer.encode("Apple's iOS 9 'App thinning' feature will give your phone's storage a boost")
        seq = tokenizer.encode(d["text"])

        device = 'cuda'
        if(len(seq)<100):
            # prepend the GPT-2 end-of-text token (id 50256); longer sequences are skipped
            seq = torch.tensor([50256] + seq, device=device, dtype=torch.long)
        else:
            continue
        x.append(seq)
        # positive example if any entry of the label vector is non-zero
        y.append(int(np.sum(d['label'])>0))
    except:
        pass

Is there any chance you can share your training text files (e.g. datasets/toxic/toxic_train.txt) or the script you used to create the text files from the original datasets? Thank you!

andreamad8 commented 3 years ago

Hi,

Yes, correct, for a larger GPT-2 you need to retrain the discriminator, since the classifier head's input size has to match the model's embedding size (see the quick check below).
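
Just to illustrate the embedding sizes involved (a minimal sketch, assuming the Hugging Face transformers package is installed; not part of the training script itself):

from transformers import GPT2Config

# hidden/embedding size that the discriminator's classifier head must match
for name in ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"]:
    config = GPT2Config.from_pretrained(name)
    print(name, config.n_embd)  # 768, 1024, 1280, 1600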

The toxic_train.txt is a plain text file with one dictionary per row, e.g.:

{"text": "sometext", "label":[0]}
{"text": "sometext", "label":[1]}

so we use eval(d) to convert each line from a string into a dictionary.

Then we used int(np.sum(d['label'])>0) because the dataset was made for multi-label classification, distinguishing different kinds of toxicity, so the label looks like [0,1,0,0], etc.
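
For reference, something along these lines should reproduce the file format (a rough sketch, assuming the Kaggle Jigsaw toxic-comment train.csv with its comment_text column and the six subcategory columns; not the exact script we used):

import csv
import json

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

with open("train.csv", newline="", encoding="utf-8") as f_in, \
        open("datasets/toxic/toxic_train.txt", "w", encoding="utf-8") as f_out:
    for row in csv.DictReader(f_in):
        record = {
            "text": row["comment_text"],
            "label": [int(row[c]) for c in LABEL_COLS],  # e.g. [0, 1, 0, 0, 0, 0]
        }
        # json.dumps output is also a valid Python dict literal here,
        # so eval(d) (or the safer json.loads) can read each line back
        f_out.write(json.dumps(record) + "\n")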

I hope this helps.

Let me know if you have other doubts.

Andrea

alisawuffles commented 3 years ago

Hi Andrea,

Thank you for the response! To clarify, does that mean that an example is counted as a positive example (i.e. toxic) if any one of the toxicity subcategories (toxic, severe_toxic, obscene, threat, insult, identity_hate) is 1 in the original dataset? Or is it a positive example if and only if toxic is 1?

Alisa

andreamad8 commented 3 years ago

Hi,

In our experiments, the first, meaning: if any one of the toxicity subcategories (toxic, severe_toxic, obscene, threat, insult, identity_hate) is 1, then we consider it toxic. I think it is possible, though, to train the discriminator for a specific subcategory 😊
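
For a single subcategory, you would just build the label from one column instead of aggregating all six, e.g. inside the loop of the conversion sketch above (column name assumed):

record = {
    "text": row["comment_text"],      # same row dict as in the sketch above
    "label": [int(row["threat"])],    # e.g. a `threat`-only discriminator
}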

Andrea

alisawuffles commented 3 years ago

Hi Andrea,

Thank you for your help!! I have trained a discriminator with a larger embedding size compatible with GPT2-large, and am able to use this discriminator to steer generations from GPT2-large.

Would you mind double-checking that I trained the discriminator in the right way? I used the command

python -m run_pplm_discrim_train --dataset=toxic --save_model

Crucially, I used run_pplm_discrim_train.py instead of paper_code/gpt2tunediscrim.py, with all of the default settings (10 epochs, learning rate 0.0001, batch size 64). Are these the same settings with which the toxicity discriminator was trained for the paper? I want to be sure, as I will be comparing to PPLM as a baseline in some experiments.

Thanks again!

Alisa

andreamad8 commented 3 years ago

Hi Alisa,

to the best of my knowledge, yes. Maybe @dathath can confirm.

In general, I cannot guarantee that these hyperparameters also work best for GPT2-large. The best way is to check the generated text a bit, along with the accuracy of the discriminator. Sometimes we found that a strong discriminator (high accuracy on the test set) was suboptimal for generation.
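
As a rough illustration of the accuracy check I mean (a sketch only; `discriminator` stands in for whatever trained classifier you end up with, assumed to map a batch of token ids to class logits, and `examples` is a held-out list of (token_tensor, label) pairs):

import torch

@torch.no_grad()
def heldout_accuracy(discriminator, examples, device="cuda"):
    # examples: list of (1-D LongTensor of token ids, int label) pairs
    correct = 0
    for seq, label in examples:
        logits = discriminator(seq.unsqueeze(0).to(device))  # assumed to return class logits
        correct += int(logits.argmax(dim=-1).item() == label)
    return correct / max(len(examples), 1)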

For your paper, I guess using this setting should be okay 😊

I hope this helps.

Andrea