renmada / t5-pegasus-pytorch


Question about beam size #6

Closed. wutong4012 closed this issue 3 years ago.

wutong4012 commented 3 years ago

Does `model.generate` in this model call into an external library? If so, how do I change the beam size? I don't see any beam-search code in this repo.

renmada commented 3 years ago

It is a class method that comes with the transformers model. Its signature is:

```python
@torch.no_grad()
def generate(
    self,
    input_ids: Optional[torch.LongTensor] = None,
    max_length: Optional[int] = None,
    min_length: Optional[int] = None,
    do_sample: Optional[bool] = None,
    early_stopping: Optional[bool] = None,
    num_beams: Optional[int] = None,
    temperature: Optional[float] = None,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    bad_words_ids: Optional[Iterable[int]] = None,
    bos_token_id: Optional[int] = None,
    pad_token_id: Optional[int] = None,
    eos_token_id: Optional[int] = None,
    length_penalty: Optional[float] = None,
    no_repeat_ngram_size: Optional[int] = None,
    encoder_no_repeat_ngram_size: Optional[int] = None,
    num_return_sequences: Optional[int] = None,
    max_time: Optional[float] = None,
    decoder_start_token_id: Optional[int] = None,
    use_cache: Optional[bool] = None,
    num_beam_groups: Optional[int] = None,
    diversity_penalty: Optional[float] = None,
    prefix_allowed_tokens_fn: Optional[Callable[[int, torch.Tensor], List[int]]] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    output_scores: Optional[bool] = None,
    return_dict_in_generate: Optional[bool] = None,
    forced_bos_token_id: Optional[int] = None,
    forced_eos_token_id: Optional[int] = None,
    remove_invalid_values: Optional[bool] = None,
    synced_gpus: Optional[bool] = None,
    **model_kwargs,
) -> Union[GreedySearchOutput, SampleOutput, BeamSearchOutput, BeamSampleOutput, torch.LongTensor]:
```

From its docstring: it generates sequences for models with a language modeling head, and currently supports greedy decoding, multinomial sampling, beam-search decoding, and beam-search multinomial sampling. Apart from `input_ids` and `attention_mask`, all the arguments below default to the value of the attribute of the same name in the model's `PretrainedConfig`; the defaults indicated are those of that config. Most of these parameters are explained in more detail in this blog post: https://huggingface.co/blog/how-to-generate

Parameters:

- `input_ids` (`torch.LongTensor` of shape `(batch_size, sequence_length)`, optional): The sequence used as a prompt for the generation. If `None`, the method initializes it as an empty `torch.LongTensor` of shape `(1,)`.
- `max_length` (`int`, optional, defaults to 20): The maximum length of the sequence to be generated.
- `min_length` (`int`, optional, defaults to 10): The minimum length of the sequence to be generated.
- `do_sample` (`bool`, optional, defaults to `False`): Whether or not to use sampling; use greedy decoding otherwise.
- `early_stopping` (`bool`, optional, defaults to `False`): Whether or not to stop the beam search when at least `num_beams` sentences are finished per batch.
- `num_beams` (`int`, optional, defaults to 1): Number of beams for beam search. 1 means no beam search.
- `temperature` (`float`, optional, defaults to 1.0): The value used to modulate the next token probabilities.
- `top_k` (`int`, optional, defaults to 50): The number of highest probability vocabulary tokens to keep for top-k filtering.
- `top_p` (`float`, optional, defaults to 1.0): If set to a float < 1, only the most probable tokens with probabilities that add up to `top_p` or higher are kept for generation.
- `repetition_penalty` (`float`, optional, defaults to 1.0): The parameter for repetition penalty. 1.0 means no penalty. See this paper for more details: https://arxiv.org/pdf/1909.05858.pdf
- `pad_token_id` (`int`, optional): The id of the padding token.
- `bos_token_id` (`int`, optional): The id of the beginning-of-sequence token.
- `eos_token_id` (`int`, optional): The id of the end-of-sequence token.
- `length_penalty` (`float`, optional, defaults to 1.0): Exponential penalty to the length. 1.0 means no penalty. Set to a value < 1.0 to encourage the model to generate shorter sequences, or to a value > 1.0 to encourage it to produce longer sequences.
- `no_repeat_ngram_size` (`int`, optional, defaults to 0): If set to an int > 0, all ngrams of that size can only occur once.
- `encoder_no_repeat_ngram_size` (`int`, optional, defaults to 0): If set to an int > 0, all ngrams of that size that occur in the `encoder_input_ids` cannot occur in the `decoder_input_ids`.
- `bad_words_ids` (`List[List[int]]`, optional): List of token ids that are not allowed to be generated. To get the token ids of words that should not appear in the generated text, use `tokenizer(bad_word, add_prefix_space=True).input_ids`.
- `num_return_sequences` (`int`, optional, defaults to 1): The number of independently computed returned sequences for each element in the batch.
- `max_time` (`float`, optional, defaults to None): The maximum amount of time, in seconds, that the computation is allowed to run for. Generation will still finish the current pass after the allotted time has passed.
- `attention_mask` (`torch.LongTensor` of shape `(batch_size, sequence_length)`, optional): Mask to avoid performing attention on padding token indices. Mask values are in `[0, 1]`: 1 for tokens that are not masked, 0 for masked tokens. If not provided, defaults to a tensor of the same shape as `input_ids` that masks the pad token. (What are attention masks? See ../glossary.html#attention-mask)
- `decoder_start_token_id` (`int`, optional): If an encoder-decoder model starts decoding with a different token than bos, the id of that token.
- `use_cache` (`bool`, optional, defaults to `True`): Whether or not the model should use the past key/values attentions (if applicable to the model) to speed up decoding.
- `num_beam_groups` (`int`, optional, defaults to 1): Number of groups to divide `num_beams` into in order to ensure diversity among different groups of beams. See this paper for more details: https://arxiv.org/pdf/1610.02424.pdf
- `diversity_penalty` (`float`, optional, defaults to 0.0): This value is subtracted from a beam's score if it generates a token that is the same as a token from any beam of another group at a given time step. Note that `diversity_penalty` is only effective if group beam search is enabled.
- `prefix_allowed_tokens_fn` (`Callable[[int, torch.Tensor], List[int]]`, optional): If provided, this function constrains the beam search to allowed tokens only at each step. If not provided, no constraint is applied. The function takes 2 arguments, the batch ID `batch_id` and `input_ids`, and has to return a list with the allowed tokens for the next generation step conditioned on the batch ID and the previously generated tokens. This argument is useful for constrained generation conditioned on the prefix, as described in Autoregressive Entity Retrieval (https://arxiv.org/abs/2010.00904).
- `output_attentions` (`bool`, optional, defaults to `False`): Whether or not to return the attention tensors of all attention layers. See `attentions` under returned tensors for more details.
- `output_hidden_states` (`bool`, optional, defaults to `False`): Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for more details.
- `output_scores` (`bool`, optional, defaults to `False`): Whether or not to return the prediction scores. See `scores` under returned tensors for more details.
- `return_dict_in_generate` (`bool`, optional, defaults to `False`): Whether or not to return a `ModelOutput` instead of a plain tuple.
- `forced_bos_token_id` (`int`, optional): The id of the token to force as the first generated token after the `decoder_start_token_id`. Useful for multilingual models like mBART, where the first generated token needs to be the target language token.
- `forced_eos_token_id` (`int`, optional): The id of the token to force as the last generated token when `max_length` is reached.
- `remove_invalid_values` (`bool`, optional): Whether to remove possible nan and inf outputs of the model to prevent the generation method from crashing. Note that using `remove_invalid_values` can slow down generation.
- `synced_gpus` (`bool`, optional, defaults to `False`): Whether to continue running the while loop until `max_length` (needed for ZeRO stage 3).
- `model_kwargs`: Additional model-specific kwargs that will be forwarded to the `forward` function of the model. If the model is an encoder-decoder model, encoder-specific kwargs should not be prefixed and decoder-specific kwargs should be prefixed with `decoder_`.

Return: `ModelOutput` or `torch.LongTensor`: a `ModelOutput` (if `return_dict_in_generate=True` or when `config.return_dict_in_generate=True`) or a `torch.FloatTensor`. If the model is not an encoder-decoder model (`model.config.is_encoder_decoder=False`), the possible `ModelOutput` types are:

For details, see https://github.com/huggingface/transformers/blob/741d48f5c7bf0acdf9b40d0deb8560b997761f3a/src/transformers/generation_utils.py
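
So the beam size is just the `num_beams` argument of `generate`. Below is a minimal sketch of how it could be passed, assuming `model` and `tokenizer` have already been loaded as in this repo's examples; the variable names, the sample text, and the other generation settings are illustrative, not taken from the repo:

```python
import torch

# Assumes `model` (a T5-PEGASUS / MT5-style seq2seq model) and its `tokenizer`
# are already loaded; the input text below is a placeholder.
text = "待摘要的输入文本"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=64,           # upper bound on generated length
        num_beams=4,             # beam size: values > 1 enable beam search
        early_stopping=True,     # stop once num_beams finished hypotheses exist
        no_repeat_ngram_size=3,  # optional: forbid repeated trigrams
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

With `num_beams=1` (the default) `generate` falls back to greedy decoding, which is why no explicit beam-search code appears in this repo: the search itself lives inside transformers' `generation_utils.py`.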