salesforce / progen

Official release of the ProGen models
BSD 3-Clause "New" or "Revised" License

Sampling conditional token distribution #7

Open aiXander opened 2 years ago

aiXander commented 2 years ago

It would be super valuable to have an example script to sample conditional token probabilities for a target index given sequence context.

There seem to be some technical details that are important, but not easy to figure out:

Finally, I currently evaluate mutations by computing the sequence likelihood of each possible mutated sequence in turn, which takes 20 forward passes per single point mutation. This seems vastly inefficient: since the model produces logits for every position, can the logits at the target index simply be used as a proxy for the token probabilities?
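To make the question concrete, here is a minimal sketch of what reading the distribution off a single forward pass could look like. It assumes the model is a left-to-right (causal) language model, so the logits emitted at position `i - 1` score the token at position `i`. The function names and the toy logits are hypothetical and not part of the ProGen code; with a real model, the `[length, vocab]` logits would come from one forward pass over the tokenized sequence.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def conditional_log_probs(logits, target_idx):
    """Return log p(x_target | x_<target) for every vocabulary entry.

    Assumption: `logits[i]` are the scores a causal model emits after
    reading tokens 0..i, so the row that predicts `target_idx` is
    `logits[target_idx - 1]`. Position 0 has no predecessor, which is why
    a leading BOS/start token is needed to score the first residue.
    """
    if not 1 <= target_idx <= len(logits):
        raise ValueError("target_idx must be in 1..len(logits)")
    return log_softmax(logits[target_idx - 1])

# Toy stand-in for model output: 3 positions, 3-letter vocabulary.
vocab = ["A", "C", "D"]
toy_logits = [
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 3.0],
    [0.5, 0.5, 0.5],
]

logp = conditional_log_probs(toy_logits, target_idx=2)
best = vocab[max(range(len(vocab)), key=lambda i: logp[i])]
print({tok: round(math.exp(lp), 3) for tok, lp in zip(vocab, logp)}, best)
```

One caveat with this shortcut: for a causal model the row only gives p(x_i | x_<i), conditioned on the left context. A mutation at position i also changes the conditionals of every later position, so these per-position probabilities are not equivalent to comparing full-sequence likelihoods of the mutated sequences.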

a-mad commented 2 years ago

a couple notes: