Revise interface of generate method in lstm and cnn models

medduk9871 commented 6 months ago

input_ids => torch.Size([n])
predicted_indices[0] => torch.Size([n])

medduk9871 commented 6 months ago

> predicted_indices[0] => torch.Size([n])
>
> The format is correct, but we need the logits and not the most likely token. Can you please output "next_token_probs" instead of torch.max?

Is it enough to return the logits for just the next token? I ask because outputs['logits'] has shape torch.Size([1, 636, 32128]).

romsto commented 6 months ago

Then I just misunderstood the torch.max method. As long as you have the logits for the new token it's what specdec needs :)

medduk9871 commented 6 months ago

You understood correctly. The information in my previous comment is not related to torch.max, so I should revise the output. My question is whether the output should be the logits for just one token, or the full tensor of shape torch.Size([1, 636, 32128]).
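
For reference, the two options in this question differ only in which sequence positions are kept. The sketch below is a minimal illustration and not the FastLLM code. It uses plain Python lists so that it runs without PyTorch, a toy tensor of shape [1, 3, 4] stands in for torch.Size([1, 636, 32128]), and the helper names `softmax` and `next_token_logits` are hypothetical. It assumes the caller only needs the distribution for the single new token; whether specdec also needs earlier positions is left open in this thread.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_logits(logits):
    """Keep only the last sequence position: [batch, seq, vocab] -> [batch, vocab].

    Hypothetical helper; in PyTorch the same slice is logits[:, -1, :].
    """
    return [sequence[-1] for sequence in logits]


# Toy stand-in for outputs['logits']: shape [1, 3, 4] instead of [1, 636, 32128].
logits = [[[0.1, 0.2, 0.3, 0.4],
           [1.0, 0.0, 0.0, 0.0],
           [0.0, 2.0, 0.0, 1.0]]]

# Option A: logits for the new token only, shape [1, 4].
last = next_token_logits(logits)

# Probabilities over the vocabulary for the new token
# (in PyTorch: torch.softmax(last, dim=-1)).
next_token_probs = [softmax(row) for row in last]

# What an argmax-style call returns instead: one index per batch row,
# which discards the rest of the distribution.
greedy_token = [max(range(len(row)), key=row.__getitem__) for row in last]
```

Returning `last` (or `next_token_probs`) keeps the full vocabulary distribution for the new token, while `greedy_token` is the kind of value the earlier `predicted_indices` output carried. If the caller also needs distributions at earlier positions, the full [batch, seq, vocab] tensor would have to be returned instead.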

ArtemisDicoTiar / FastLLM

Revise interface of generate method in lstm and cnn models #41