What is the context of the feature?
Generate a file of all proteins & protein complexes in the dataset.
Describe the solution
Call the Ensembl API to find the length of each protein, then compute the length of each complex, and sort the proteins and complexes from shortest to longest. This ensures that the shortest sequences are run through ColabFold first, so they aren't stuck waiting behind longer sequences in the job queue.
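A minimal sketch of the steps above, under a few assumptions that are not in this issue: protein lengths come from the Ensembl REST `sequence/id` endpoint with `type=protein`, a complex's length is taken as the sum of its member protein lengths, and each dataset entry is a tuple of protein IDs (a 1-tuple for a single protein). The function names and the placeholder IDs are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_protein_length(ensembl_id):
    """Return the amino-acid length for an Ensembl ID (requires network access).

    Assumption: the ID resolves to a protein sequence via GET /sequence/id/:id.
    """
    query = urllib.parse.urlencode(
        {"type": "protein", "content-type": "application/json"}
    )
    url = f"{ENSEMBL_REST}/sequence/id/{ensembl_id}?{query}"
    with urllib.request.urlopen(url) as response:
        return len(json.load(response)["seq"])


def sort_by_length(entries, protein_lengths):
    """Sort proteins and complexes from shortest to longest.

    entries: list of tuples of protein IDs; a 1-tuple is a single protein.
    protein_lengths: dict mapping protein ID -> length in residues.
    Assumption: a complex's length is the sum of its members' lengths.
    """
    def total_length(entry):
        return sum(protein_lengths[protein_id] for protein_id in entry)

    return sorted(entries, key=total_length)


# Placeholder IDs and lengths for illustration only (not real data).
example_lengths = {"PROT_A": 300, "PROT_B": 120, "PROT_C": 80}
example_entries = [
    ("PROT_A",),
    ("PROT_B", "PROT_C"),
    ("PROT_C",),
    ("PROT_A", "PROT_B"),
]
ordered = sort_by_length(example_entries, example_lengths)
```

In practice, `protein_lengths` would be filled by calling `fetch_protein_length` once per unique protein ID and caching the result, so that proteins shared between complexes are only looked up once.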