Closed shawncal closed 2 years ago
Did you find the structures were as good without searching BFD? AlphaFold still does the same as before, running against both databases at the same time; maybe it could be adjusted to run in a similar way, UniRef first, then BFD. I guess they don't mind if they end up with way more than 2000 proteins at >=75% coverage?
This change sequences the MSA search: first looking through UniRef (x4 times), then expanding to BFD (x4 times). Since these database files are very large, this improves the chance that we can keep the UniRef db in cache rather than loading it (or streaming it, if pulling from a network resource) 4 separate times.
For sequences that are well-represented in UniRef, we may never need to load or search BFD, which will speed things up significantly.
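The control flow described above can be sketched roughly as follows. This is only an illustration, not the actual script from this PR: `sequenced_search`, `run_search`, the e-value list, and the `min_hits` threshold are all hypothetical names and values, and the real search tool invocation is abstracted behind a callable. Whether the second stage searches BFD alone or UniRef and BFD together is not shown here; the sketch just passes a database label.

```python
from typing import Callable, List, Sequence, Tuple


def sequenced_search(
    databases: Sequence[str],
    e_values: Sequence[float],
    run_search: Callable[[str, float], int],
    min_hits: int,
) -> Tuple[int, List[Tuple[str, float]]]:
    """Search databases one after another, stopping as soon as enough hits are found.

    run_search(db, e_value) is a stand-in (assumption) for the real MSA search
    and returns the number of sequences that pass the coverage filter.
    Returns the best hit count seen and the list of (db, e_value) searches run.
    """
    calls: List[Tuple[str, float]] = []
    best = 0
    for db in databases:            # finish every pass over one db before the next
        for e_value in e_values:    # progressively looser e-value cutoffs
            hits = run_search(db, e_value)
            calls.append((db, e_value))
            best = max(best, hits)
            if hits >= min_hits:    # well-represented sequence: skip remaining work
                return best, calls
    return best, calls


if __name__ == "__main__":
    # Hypothetical e-value schedule (4 passes per database) and fake search results.
    E_VALUES = [1e-30, 1e-10, 1e-6, 1e-3]

    def fake_search(db: str, e_value: float) -> int:
        return 3000 if db == "uniref" else 0

    best, calls = sequenced_search(["uniref", "bfd"], E_VALUES, fake_search, 2000)
    print(best, calls)
```

Because all passes over the first database complete before the second is touched, the first database's files have a better chance of staying in the page cache, and the second database is only opened when the first did not yield enough hits.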
Other, minor changes: