Open freyaya123 opened 6 months ago
Hello,
1/ It seems that HF returns the scores over the whole vocabulary at each step, whereas CTranslate2 returns the sum of the highest score at each step. The 3 in your case is the batch size.
2/ Could you set include_eos_in_hypotheses to True? The EOS token should then be added at the end.
What do you mean by "Ctranslate2 calculate the sum of the highest score of each step"? For example, if we assume bs=1, the HF score is seq_len * [1, vocab], while the ctranslate2 score is a list of length 1: [Num]. What is Num equal to?
For example, with bs = 1, the HF score has shape seq_len x 1 x vocab, whereas the CTranslate2 score has shape 1: (max score in vocab) of token 1 + (max score in vocab) of token 2 + ... + (max score in vocab) of token seq_len. If you want the max score for each token, you can use the async function and then get the score of each token.
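The relationship described above can be sketched in plain Python. This is only an illustration, not CTranslate2's or HF's actual code: the logits are made-up numbers, `log_softmax` is a helper written here, and it assumes greedy decoding, where the selected token is the highest-scoring one at each step (with sampling or beam search the selected token need not be the maximum).

```python
import math

def log_softmax(logits):
    """Turn raw logits for one step into log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

# Hypothetical HF-style scores for bs = 1: seq_len = 3 steps, vocab = 4 entries each.
step_logits = [
    [2.0, 0.5, 0.1, -1.0],
    [0.3, 1.7, 0.2, 0.0],
    [-0.5, 0.0, 2.5, 1.0],
]

# One number per generated token: the best log-probability at that step.
per_token = [max(log_softmax(logits)) for logits in step_logits]

# One number per sequence: the sum over steps, as described for CTranslate2.
sequence_score = sum(per_token)

print(per_token)
print(sequence_score)
```

So HF hands back the full `seq_len x 1 x vocab` table, while the single number per hypothesis is what remains after picking one entry per step and adding them up.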
Thank you! Another question: given the autoregressive scores after the linear layer and the chain rule, why is it a sum here rather than a product? If we want to calculate the score of the generated sequence, we have P(x1) P(x2|x1) P(x3|x1,x2) ... P(xn|x1,x2,...,x_n-1) = P(x1,x2,...,xn). I remember there is no log operation in the returned HF score.
In CTranslate2, the score at each step is a log-likelihood. The log of a product is the sum of the logs, which is why we sum.
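A small numeric check of this point, using made-up conditional probabilities (the values below are hypothetical, not taken from any model): summing log-probabilities gives the log of the chain-rule product, and the sum stays usable for long sequences where the raw product underflows.

```python
import math

# Hypothetical conditional probabilities P(x1), P(x2|x1), P(x3|x1,x2).
cond_probs = [0.9, 0.5, 0.8]

joint = math.prod(cond_probs)                      # chain-rule product
log_score = sum(math.log(p) for p in cond_probs)   # sum of log-likelihoods

# Both describe the same sequence probability.
same = math.isclose(math.exp(log_score), joint)
print(joint, log_score, same)

# For a long sequence, the product underflows to 0.0 in float64,
# while the sum of logs remains a finite negative number.
long_probs = [0.1] * 1000
long_product = math.prod(long_probs)
long_log_score = sum(math.log(p) for p in long_probs)
print(long_product, long_log_score)
```

To compare with a probability, exponentiate the summed score; for ranking hypotheses the log-domain value can be used directly.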
Oh I see! Thanks so much!
Sorry, I can't find the parameter include_eos_in_hypotheses in the generate_batch function. Where do I set it?
Hi, I'm new to ctranslate2, and I'm confused about the scores returned by the generator.generate_batch() function. What do they correspond to among the scores returned by the huggingface generate() function?
For example,
in hf generation:
But if I use ctranslate2, for example:
I will get a list of len 3 for step_results[0].scores
And I also noticed that there is another function in hf:
which is really different from the scores in step_results. So I have two questions here:
[...] generated_outputs.scores, transition_scores and step_results[0].scores? [...] hf [...]