AstraZeneca / DiffAbXL

The official implementation of DiffAbXL benchmarked in the paper "Exploring Log-Likelihood Scores for Ranking Antibody Sequence Designs", formerly titled "Benchmarking Generative Models for Antibody Design".
Apache License 2.0

Log-Likelihood Calculation in DiffAbXL #4

Closed IrumHu closed 5 days ago

IrumHu commented 1 week ago

I have been exploring your DiffAbXL repository and found the section in the README about how to build an interface for benchmarking models. However, I couldn't find any specific code or reference for how DiffAbXL calculates these log-likelihood values. Could you please point me to the relevant part of the repository where this is handled?

talipucar commented 5 days ago

Hi @IrumHu

This repository was originally intended to contain only the model-related code; our internal benchmarking pipeline is kept separately. However, I've added a script that demonstrates how to compute the log-likelihood as outlined in the paper (see below).

Please note that the compute_loglikelihood method assumes that its inputs, such as sequence_tokens_list and posterior_list, contain values for the specific positions where mutations occur. In other words, the sequences in sequence_tokens_list don't need to be contiguous segments of the full sequence; they just need to include the tokens at the mutated positions.

I will close this issue, but if you have any further questions, please feel free to re-open it.

https://github.com/AstraZeneca/DiffAbXL/blob/master/compute_loglikelihood.py
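
For readers who want a feel for the idea before opening the script, here is a minimal sketch of scoring designs from per-position posteriors. It is not the code in compute_loglikelihood.py. The function name sketch_loglikelihood, the input layout (one list of token indices and one list of probability vectors per design, both restricted to the mutated positions), the eps floor, and the choice to sum log-probabilities over positions are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def sketch_loglikelihood(
    sequence_tokens_list: Sequence[Sequence[int]],
    posterior_list: Sequence[Sequence[Sequence[float]]],
    eps: float = 1e-12,
) -> List[float]:
    """Hypothetical helper: one score per design.

    Assumed layout (not confirmed by the repository):
      sequence_tokens_list[i][j] -> token index at the j-th mutated position of design i
      posterior_list[i][j]       -> probability vector over tokens at that same position
    """
    scores = []
    for tokens, posterior in zip(sequence_tokens_list, posterior_list):
        if len(tokens) != len(posterior):
            raise ValueError("tokens and posterior must cover the same positions")
        total = 0.0
        for token, probs in zip(tokens, posterior):
            # Floor the probability at eps so log(0) cannot occur.
            total += math.log(max(probs[token], eps))
        scores.append(total)
    return scores


# Two toy designs, each mutated at the same two positions, 3-token alphabet.
tokens = [[0, 2], [1, 2]]
posteriors = [
    [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]],
    [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]],
]
scores = sketch_loglikelihood(tokens, posteriors)

# Rank designs from highest to lowest score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Because only the mutated positions are passed in, the positions do not have to be adjacent in the full sequence, which matches the note above about sequence_tokens_list.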