NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT
Apache License 2.0

GPTNeox decoding arguments #713

Open w775739733 opened 1 year ago

w775739733 commented 1 year ago

Hello! I am using GPTNeox in FasterTransformer for decoding, and I need to pass some parameters to the forward call, such as length_penalty and so on. I also call model.generate() in Transformers for generation. When I use the same set of parameters, the results of the two are not consistent. Is there any documentation describing the exact meaning and valid range of these parameters? Are they exactly the same in meaning and range as the corresponding parameters in Transformers?
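For reference, this is roughly the Transformers side of the comparison (a minimal sketch; the checkpoint, prompt, and max_new_tokens are placeholders, not taken from the issue):

    # Minimal sketch of the Hugging Face generate() call being compared against FT.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "EleutherAI/gpt-neox-20b"  # example checkpoint; use the model actually being served
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer("Hello, my name is", return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        num_beams=1,
        do_sample=False,           # greedy search: temperature, top_k and early_stopping have no effect here
        temperature=0.9,
        top_k=40,
        repetition_penalty=1.01,
        length_penalty=1.0,        # only affects beam-search scoring
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))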

w775739733 commented 1 year ago

The args:

    "gen_kwargs": {
        "max_new_tokens": 2048,
        "num_beams": 1,
        "early_stopping": True,
        "do_sample": False,
        "temperature": 0.9,  # 0.35
        "logits_processor": None,
        "top_k": 40,
        "repetition_penalty": 1.01,
        "length_penalty": 1.0,
        "eos_token_id": []
    },
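For comparison, here is a rough, hypothetical mapping of those gen_kwargs onto the argument names used by the FasterTransformer PyTorch GPT/GPT-NeoX example op. The FT names below (output_len, beam_width, len_penalty, ...) are assumptions based on the example scripts and should be checked against the forward() signature of the FT version in use:

    # Hypothetical mapping of the gen_kwargs above to FasterTransformer-style arguments.
    # The names are assumptions taken from the PyTorch example scripts and may differ by FT version.
    ft_kwargs = {
        "output_len": 2048,          # ~ max_new_tokens
        "beam_width": 1,             # ~ num_beams (1 means no beam search)
        "top_k": 40,                 # sampling parameter; the FT examples expose no do_sample flag,
        "top_p": 0.0,                #   greedy vs. sampling is typically selected via top_k/top_p
        "temperature": 0.9,          # note: Transformers ignores temperature when do_sample=False
        "len_penalty": 1.0,          # ~ length_penalty (beam search only)
        "repetition_penalty": 1.01,
    }

One likely source of the mismatch is visible already in this mapping: with "do_sample": False, Transformers runs greedy search and ignores temperature/top_k, whereas passing top_k=40 to FT keeps sampling enabled, so the two runs may not even be using the same decoding strategy.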
RobotGF commented 11 months ago

"repetition_penalty": 1.01 have bug

hezeli123 commented 8 months ago

The logic of repetition_penalty in FT is not the same as the OpenAI description. How should it be used? OpenAI (https://platform.openai.com/docs/guides/gpt/managing-tokens):

    mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence
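A small sketch contrasting the two penalty styles discussed here. The OpenAI formula above is an additive frequency/presence penalty; Hugging Face's RepetitionPenaltyLogitsProcessor instead applies the CTRL-style multiplicative penalty (divide positive logits, multiply negative ones), and the assumption in this thread is that FT's repetition_penalty follows the same multiplicative scheme:

    # Illustrative NumPy sketch, not FT's actual implementation.
    import numpy as np

    def openai_additive_penalty(logits, counts, alpha_frequency, alpha_presence):
        """OpenAI-style: mu[j] -> mu[j] - c[j]*alpha_frequency - float(c[j] > 0)*alpha_presence."""
        return logits - counts * alpha_frequency - (counts > 0).astype(logits.dtype) * alpha_presence

    def ctrl_multiplicative_penalty(logits, seen_token_ids, penalty):
        """CTRL-style: divide positive logits and multiply negative logits of already-seen tokens."""
        out = logits.copy()
        for t in set(seen_token_ids):
            out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
        return out

    # Tiny usage example:
    logits = np.array([2.0, -1.0, 0.5])
    counts = np.array([1, 0, 3])
    print(openai_additive_penalty(logits, counts, alpha_frequency=0.5, alpha_presence=0.2))
    print(ctrl_multiplicative_penalty(logits, seen_token_ids=[0, 2], penalty=1.01))

Note that with penalty = 1.01 the multiplicative scheme changes the affected logits by only about 1%, so its effect is much weaker than an additive penalty of similar magnitude.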