OpenNMT / CTranslate2

Fast inference engine for Transformer models
https://opennmt.net/CTranslate2
MIT License
3.38k stars, 300 forks

No logits returned after generate #1779

Open LAnCeBabY opened 1 month ago

LAnCeBabY commented 1 month ago

After calling ctranslate2.models.Whisper.generate, the result does not include "logits". Version: 4.4.0.

minhthuc2502 commented 1 month ago

Do you set the return_logits_vocab=True?

LAnCeBabY commented 1 month ago

> Do you set the return_logits_vocab=True?

Yes, but I can't find logits in WhisperGenerationResult.

minhthuc2502 commented 1 month ago

Can you send me the code? I will test it.

LAnCeBabY commented 1 month ago

> Can you send me the code? I will test it.

The code is modified from faster-whisper:

    result = self.model.generate(
        encoder_output,
        [prompt],
        length_penalty=options.length_penalty,
        repetition_penalty=options.repetition_penalty,
        no_repeat_ngram_size=options.no_repeat_ngram_size,
        max_length=max_length,
        return_logits_vocab=True,
        return_scores=True,
        return_no_speech_prob=True,
        suppress_blank=options.suppress_blank,
        suppress_tokens=options.suppress_tokens,
        max_initial_timestamp_index=max_initial_timestamp_index,
        **kwargs,
    )

I also printed the attributes of the WhisperGenerationResult class. There seems to be no 'logits' attribute in it:

    >>> dir(ctranslate2.models.WhisperGenerationResult)
    ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
     '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__',
     '__init__', '__init_subclass__', '__le__', '__lt__', '__module__',
     '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
     '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
     'no_speech_prob', 'scores', 'sequences', 'sequences_ids']
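The same check can be made more targeted by keeping only the public names and testing for logits directly. This is a minimal sketch, not part of CTranslate2: public_fields and has_logits are hypothetical helper names, and ResultV440 is a stand-in class that only mirrors the public fields printed for 4.4.0, so the snippet runs without CTranslate2 installed.

```python
def public_fields(cls):
    """Return the sorted attribute names that do not start with an underscore."""
    return sorted(name for name in dir(cls) if not name.startswith("_"))


def has_logits(cls):
    """Return True if the class (or instance) exposes a `logits` attribute."""
    return "logits" in public_fields(cls)


# Stand-in mirroring the public fields listed for 4.4.0 (NOT the real class).
class ResultV440:
    no_speech_prob = None
    scores = None
    sequences = None
    sequences_ids = None


print(public_fields(ResultV440))
print(has_logits(ResultV440))
```

With CTranslate2 installed, the equivalent call would be has_logits(ctranslate2.models.WhisperGenerationResult), which makes it easy to compare what different installed versions expose.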

bogdanteleaga commented 1 month ago

Seems to be missing from the Python wrapper https://github.com/OpenNMT/CTranslate2/blob/v4.4.0/python/cpp/whisper.cc#L125

discojune commented 1 week ago

In v4.5.0, logits is present in WhisperGenerationResult, but it appears to be empty: result.logits -> [[ [cpu:0 float32 storage viewed as ]]]

I'm wondering if this is expected or if it's possible to retrieve meaningful values for logits. Other transcription-related aspects are working fine.

Thank you.