Open SeanR22 opened 3 years ago
In thinking about ways to improve the method so that it can generate conformers for longer sequences, I wonder whether the chunk search method could be "trained" on runs of shorter sequences with similar sequence character. For example, if we created an idpconfgen_database.json file from the low-energy structures produced by runs on a particular sequence, would searches against this trained database run more efficiently? How hard would it be to add the ability to create a database from a folder of conformers generated by IDPConfGen?
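As a rough illustration of what "database from a folder of conformers" could look like, here is a minimal sketch that walks a folder of PDB files, computes backbone phi/psi torsions, and writes them to one JSON file. This is not the real idpconfgen database schema or API; `build_database`, `phi_psi`, and the output layout are all hypothetical names chosen for the example.

```python
# Hypothetical sketch: harvest phi/psi torsions from a folder of PDB
# conformers into a single JSON file. NOT the idpconfgen_database.json
# schema; all function names here are illustrative only.
import json
import math
from pathlib import Path

import numpy as np


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees for four 3D points."""
    b0 = np.subtract(p0, p1)
    b1 = np.subtract(p2, p1)
    b2 = np.subtract(p3, p2)
    b1 = b1 / np.linalg.norm(b1)
    # Project b0 and b2 onto the plane perpendicular to b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def backbone_atoms(pdb_path):
    """Yield (resnum, atom_name, xyz) for backbone N/CA/C ATOM records."""
    for line in open(pdb_path):
        if line.startswith("ATOM") and line[12:16].strip() in ("N", "CA", "C"):
            yield (int(line[22:26]), line[12:16].strip(),
                   (float(line[30:38]), float(line[38:46]), float(line[46:54])))


def phi_psi(atoms):
    """Compute {resnum: (phi, psi)} for interior residues of one chain."""
    coords = {(resnum, name): xyz for resnum, name, xyz in atoms}
    resnums = sorted({r for r, _ in coords})
    angles = {}
    for i in resnums[1:-1]:
        try:
            phi = dihedral(coords[(i - 1, "C")], coords[(i, "N")],
                           coords[(i, "CA")], coords[(i, "C")])
            psi = dihedral(coords[(i, "N")], coords[(i, "CA")],
                           coords[(i, "C")], coords[(i + 1, "N")])
        except KeyError:
            continue  # skip residues with missing backbone atoms
        angles[i] = (phi, psi)
    return angles


def build_database(folder, out_json):
    """Collect phi/psi from every .pdb in `folder` into one JSON file."""
    db = {pdb.name: {str(k): v for k, v in phi_psi(backbone_atoms(pdb)).items()}
          for pdb in sorted(Path(folder).glob("*.pdb"))}
    Path(out_json).write_text(json.dumps(db, indent=1))
```

Mapping this onto the actual database format idpconfgen expects (secondary-structure annotations, the exact JSON keys) would be the real work; the torsion harvesting itself is cheap.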
Furthermore, if we had a database of low-energy structures built with the chunk method, could we then scale the method up by adding pieces as "fragments" longer than 5 residues?
In short, the chunk method could be used to create a representative data set for shorter sequences (i.e., up to 150 amino acids), and longer sequences could then be built from fragments drawn from this representative data set.
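The fragment idea above can be sketched very simply at the torsion level: given a database of per-fragment (phi, psi) lists, concatenate randomly chosen fragments until the target chain length is reached. This ignores everything that makes the real problem hard (steric clashes, junction geometry, energy filtering); `assemble_torsions` is a hypothetical name for illustration only.

```python
# Hypothetical sketch of fragment-based assembly: stitch torsion angles
# from pre-built fragments into a longer chain. No clash checking or
# energy evaluation; this only shows the bookkeeping.
import random


def assemble_torsions(fragment_db, target_len, rng=None):
    """Concatenate (phi, psi) pairs from randomly chosen fragments
    until the chain reaches target_len residues, then truncate."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    fragments = list(fragment_db.values())
    chain = []
    while len(chain) < target_len:
        chain.extend(rng.choice(fragments))
    return chain[:target_len]
```

A real implementation would also have to resolve the junctions between fragments, which is presumably where the low-energy training data would earn its keep.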
Another way to create longer conformers would be to extend low-energy conformers already generated for shorter sequences.
I have generated many conformers for a number of sequences of different lengths. The graphs below show how the rate of conformer generation slows dramatically, and the conformer energies explode, with increasing sequence length.