You can set repetition_penalty on the actor, and adjust temperature, top_k, and top_p to improve the sampled results:
from textrl import TextRLActor

actor = TextRLActor(env, model, tokenizer,
                    act_deterministically=False,  # if True, always pick the max-probability token at each step instead of sampling
                    temperature=1.0,              # sampling temperature
                    compare_sample=2,             # number of samples to rank
                    top_k=0,                      # top-k sampling (0 disables the filter)
                    top_p=1.0,                    # top-p (nucleus) sampling
                    repetition_penalty=2)         # repetition penalty from the CTRL paper (https://arxiv.org/abs/1909.05858)
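If the duplication comes from plain GPT-2 sampling rather than the RL loop, the same knobs are also exposed by Hugging Face's generate(). Here is a minimal sketch, assuming the transformers library and a standard GPT-2 checkpoint (this is separate from TextRL, just for comparison):

# Sketch only: decoding-side duplicate suppression with Hugging Face transformers,
# not part of TextRL itself.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The weather today is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,            # sample instead of greedy decoding
    temperature=0.7,           # lower temperature -> less random output
    top_k=50,                  # keep only the 50 most likely tokens each step
    top_p=0.95,                # nucleus sampling
    repetition_penalty=1.2,    # CTRL-style penalty on already-generated tokens
    no_repeat_ngram_size=3,    # forbid repeating any 3-gram, which blocks most duplicate phrases
    max_new_tokens=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

no_repeat_ngram_size is usually the most direct way to stop verbatim repetition, while repetition_penalty discourages it more gradually.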
Thank you!
Hello! Thank you for the awesome project. I found that some models generate repeated/duplicate text/sentences with TextRL.
Could you add code that removes repeated/duplicate text/sentences from a GPT-2 model when sampling with temperature?