ArtemisDicoTiar / FastLLM

Revise interface of generate method in lstm and cnn models #41

Closed medduk9871 closed 6 months ago

medduk9871 commented 6 months ago
  • predicted_indices[0] => torch.Size([n])

The format is correct, but we need the logits, not the most likely token. Can you please output "next_token_probs" instead of the torch.max result?

Is it enough to return the logits for just the next token? I ask because outputs['logits'] has shape torch.Size([1, 636, 32128]).

romsto commented 6 months ago
Then I just misunderstood the torch.max method. As long as you have the logits for the new token it's what specdec needs :)

medduk9871 commented 6 months ago
You understood correctly; the info in my previous comment is not related to torch.max, so I should revise the output. My question is whether the output should be the logits for just one token or the full tensor of shape torch.Size([1, 636, 32128]).
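
For reference, here is a minimal sketch of the two options discussed in this thread. It is not the repository's code: the helper name `next_token_probs` is borrowed from the request above, the function body is hypothetical, and it assumes the logits are laid out as [batch, seq_len, vocab] with the last position along seq_len corresponding to the newest token. A small stand-in tensor replaces the [1, 636, 32128] one.

```python
import torch


def next_token_probs(logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: map [batch, seq_len, vocab] logits to
    [batch, vocab] probabilities for the token after the last position."""
    # Assumption: the final position along seq_len holds the newest token's logits.
    last_logits = logits[:, -1, :]
    # Softmax over the vocabulary dimension turns logits into a distribution.
    return torch.softmax(last_logits, dim=-1)


# Small stand-in tensor; the tensor in this thread had shape [1, 636, 32128].
torch.manual_seed(0)
logits = torch.randn(1, 6, 10)

# Option 1: a full distribution over the vocabulary for the next token.
probs = next_token_probs(logits)  # shape [1, 10]

# Option 2: torch.max over the vocabulary dimension returns (values, indices).
# The indices are greedy token ids; the rest of the distribution is discarded.
max_values, predicted_indices = torch.max(logits, dim=-1)  # each of shape [1, 6]
```

If the consumer only needs the distribution for the newly generated token, as the reply above suggests for specdec, slicing the last position avoids passing the whole [batch, seq_len, vocab] tensor around.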