brandontrabucco / design-bench

Benchmarks for Model-Based Optimization
MIT License

Completeness of TF Bind 8 Data in Design-bench #22

Open sqhang opened 3 months ago

sqhang commented 3 months ago

According to the Design-bench paper, the TF Bind 8 dataset by default includes binding affinity data for TF SIX6_REF_R1. I understand that the Design-bench dataset is currently undergoing migration. In the interim, I accessed a folder named tf_bind_8-SIX6_REF_R1 within the design_bench_data_incomplete.zip file from the Google Drive link shared by the repository maintainer. Could you confirm whether this folder contains the complete TF Bind 8 dataset as referenced in the Design-bench paper?

Thank you for your assistance!

christopher-beckham commented 3 months ago

Hi,

The default TFBind8 subvariant is tf_bind_8-SIX6_REF_R1, and it will be downloaded automatically in the new migration branch (chris/fixes-v2). Was there another version of TFBind8 you wanted to access?

The data I previously uploaded to my Google Drive were various segments of the full design_bench_data folder. Shortly afterwards, @brandontrabucco also uploaded data that appears to contain other variants of TFBind8. In his data I see about 200 .npy files, seemingly corresponding to different variants; the first 10 are:

tf_bind_8-ARX_L343Q_R1-y-0.npy
tf_bind_8-ARX_L343Q_R2-y-0.npy
tf_bind_8-ARX_P353L_R1-y-0.npy
tf_bind_8-ARX_P353L_R2-y-0.npy
tf_bind_8-ARX_P353R_R1-y-0.npy
tf_bind_8-ARX_P353R_R2-y-0.npy
tf_bind_8-ARX_R332H_R1-y-0.npy
tf_bind_8-ARX_R332H_R2-y-0.npy
tf_bind_8-ARX_REF_R1-y-0.npy
tf_bind_8-ARX_REF_R2-y-0.npy

Does this answer your question?

Thanks, Chris
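
For anyone sorting through the roughly 200 files mentioned above, here is a minimal sketch that groups file names by variant. The pattern is inferred only from the ten names listed in the comment, and the field names (`tf`, `variant`, `rep`, `shard`) and the helper `group_by_variant` are hypothetical labels, not part of design-bench.

```python
import re
from collections import defaultdict

# Assumed layout, inferred only from the ten listed names:
#   tf_bind_8-<tf>_<variant>_<rep>-y-<shard>.npy
# The group names are illustrative guesses, not design-bench terminology.
PATTERN = re.compile(
    r"^tf_bind_8-(?P<tf>[^_]+)_(?P<variant>.+)_(?P<rep>R\d+)-y-(?P<shard>\d+)\.npy$"
)

def group_by_variant(filenames):
    """Map (tf, variant) -> sorted list of replicate labels found."""
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue  # skip names that do not fit the assumed pattern
        groups[(match["tf"], match["variant"])].append(match["rep"])
    return {key: sorted(reps) for key, reps in groups.items()}

names = [
    "tf_bind_8-ARX_L343Q_R1-y-0.npy",
    "tf_bind_8-ARX_L343Q_R2-y-0.npy",
    "tf_bind_8-ARX_REF_R1-y-0.npy",
    "tf_bind_8-ARX_REF_R2-y-0.npy",
]
for key, reps in group_by_variant(names).items():
    print(key, reps)
```

Names that do not match the assumed pattern are silently skipped, so it is worth comparing the number of grouped files against the total file count.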

sqhang commented 3 months ago

Hi Chris,

Thanks a lot for your detailed answer!

I believe you answered my question very well! I have confirmed that the data here for TF SIX6_REF_R1 is mainly what I need. As you suggested, I also found around 200 .npy files with binding affinities for other TFs in Brandon's upload. One quick follow-up question:

  1. Do those 200 tf_bind_8-{TF}-y-0.npy files all correspond to the same default x file? Namely, do they correspond to the same list of 8-mers as for TF SIX6_REF_R1?

Thanks a lot for your help!
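
While waiting for a confirmation, one partial sanity check for the follow-up question is to compare row counts: if a y file pairs with the default x file, the two should at least have the same number of rows. The sketch below assumes NumPy is installed; the helper `check_row_counts` and any paths passed to it are hypothetical, since the thread does not give the name of the x file. Matching row counts is a necessary condition only and does not prove that the rows refer to the same 8-mers in the same order.

```python
import glob
import os

import numpy as np

def check_row_counts(x_path, y_pattern):
    """Compare the row count of one x file against every matching y file.

    x_path and y_pattern are caller-supplied; nothing here is a
    design-bench API. Returns (n_x, {y_file_name: n_y}) where the dict
    holds only the y files whose row count differs from n_x.
    """
    n_x = np.load(x_path).shape[0]
    mismatched = {}
    for y_path in sorted(glob.glob(y_pattern)):
        n_y = np.load(y_path).shape[0]
        if n_y != n_x:
            mismatched[os.path.basename(y_path)] = n_y
    return n_x, mismatched
```

An empty dictionary in the second return value means every y file has as many rows as the x file; whether the ordering of sequences also matches still needs confirmation from the maintainers.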