FreshAirTonight / af2complex

Predicting direct protein-protein interactions with AlphaFold deep learning neural network models.

reduced_dbs #15

Open px172 opened 1 year ago

px172 commented 1 year ago

Hi, in example/run_fea_gen.sh, db_preset is set to reduced_dbs. Is there any performance difference between "reduced_dbs", "full_dbs", and "uniprot"? Which one do you suggest?

FreshAirTonight commented 1 year ago

reduced_dbs and full_dbs are the same as defined in the official AF2. The difference is that reduced_dbs uses a reduced BFD library, rather than the full-size BFD, to build the input MSA. The uniprot option searches only the UniProt sequence database and ignores BFD and MGnify. If you have an exotic target sequence that has very few (<100) homologous hits in UniProt, it is a good idea to search as many sequence libraries as possible. For a target sequence that already has thousands of homologous sequences in UniProt, the uniprot option is sufficient and much more efficient.
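The rule of thumb above could be sketched as a small helper. This is only an illustration, not part of af2complex: the function name `suggest_db_preset` is hypothetical, the lower cutoff of 100 comes from the comment, while the upper cutoff of 1000 (standing in for "thousands") and the `reduced_dbs` fallback for counts in between are assumptions. Mapping "search as many sequence libraries as possible" to `full_dbs` is also an interpretation.

```python
def suggest_db_preset(num_uniprot_hits, few_hits=100, many_hits=1000,
                      middle="reduced_dbs"):
    """Suggest a db_preset value from the number of UniProt homologs.

    Hypothetical helper. `few_hits` follows the "<100" figure in the
    comment; `many_hits` and `middle` are assumptions made for this sketch.
    """
    if num_uniprot_hits < 0:
        raise ValueError("num_uniprot_hits must be non-negative")
    if num_uniprot_hits < few_hits:
        # Exotic target with few homologs: search the largest set of databases.
        return "full_dbs"
    if num_uniprot_hits >= many_hits:
        # Plenty of homologs already: UniProt alone is sufficient and faster.
        return "uniprot"
    # The comment gives no guidance for this range; fall back to the assumed default.
    return middle


if __name__ == "__main__":
    for hits in (12, 450, 8000):
        print(hits, suggest_db_preset(hits))
```

The cutoffs are keyword arguments so they can be adjusted to whatever counts as "few" or "thousands" for a given project.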