moltimate / moltimate-backend

A protein active site alignment tool
GNU General Public License v2.0

Faster align #126

Closed jmiller656 closed 5 years ago

jmiller656 commented 5 years ago

Improve alignment speed by using BioJava's StructureIO API, which caches structures locally.

Do not merge until #125 is merged

blackpan2 commented 5 years ago

So this caches structures locally instead of always fetching them from the PDB. Wouldn't that only help when we fetch the same structure from the PDB repeatedly? Is that the case we're caching for?

I like the other changes, just checking in.

jmiller656 commented 5 years ago

It essentially saves us ~1000 network calls per alignment, which helps for a ton of reasons. It keeps the structures stored locally.

blackpan2 commented 5 years ago

But wait, why is it saving us so many calls? Where are we fetching the same structure multiple times?

jmiller656 commented 5 years ago

It happens every time we call ProteinUtils.queryPDB. That gets called once for every single motif, every time we run an alignment: https://github.com/moltimate/moltimate-backend/blob/40156b89489310e281eeb98349d71bad736b4439/src/main/java/org/moltimate/moltimatebackend/service/AlignmentService.java#L143

blackpan2 commented 5 years ago

Ok yes, I see that now. Is there any way to configure the structure caching that this does? We reference a lot of different proteins, and this could drive memory usage way up.

jmiller656 commented 5 years ago

According to the BioJava docs, the cache is stored on the local filesystem, so it should not drive memory usage up.
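To make the idea concrete, here is a minimal sketch of a disk-backed cache in plain Java. It is not BioJava's implementation and not the code in this PR: `DiskStructureCache`, its `remoteFetch` function (standing in for a PDB download), and the `.txt` file naming are all hypothetical names made up for illustration. The point is only that cached entries live as files in a directory, so repeated lookups skip the network without keeping every structure on the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical disk-backed cache; an illustration, not BioJava's code. */
public class DiskStructureCache {
    private final Path cacheDir;
    // Stands in for a real PDB download (assumption for this sketch).
    private final Function<String, String> remoteFetch;
    private int remoteCalls = 0;

    public DiskStructureCache(Path cacheDir, Function<String, String> remoteFetch) {
        this.cacheDir = cacheDir;
        this.remoteFetch = remoteFetch;
    }

    /** Returns the cached file contents, fetching remotely only on a miss. */
    public String get(String pdbId) {
        Path file = cacheDir.resolve(pdbId.toLowerCase(Locale.ROOT) + ".txt");
        try {
            if (Files.exists(file)) {
                // Cache hit: read from disk, no network call.
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            }
            // Cache miss: fetch once, then persist to the cache directory.
            remoteCalls++;
            String body = remoteFetch.apply(pdbId);
            Files.createDirectories(cacheDir);
            Files.write(file, body.getBytes(StandardCharsets.UTF_8));
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public int getRemoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("structure-cache");
        DiskStructureCache cache = new DiskStructureCache(dir, id -> "STRUCTURE " + id);
        // Simulate ~1000 per-motif lookups of the same structure.
        for (int i = 0; i < 1000; i++) {
            cache.get("1ABC");
        }
        System.out.println("remote calls: " + cache.getRemoteCalls()); // prints "remote calls: 1"
    }
}
```

With a cache like this, the trade-off moves from memory to disk space, so the thing to watch would be the size of the cache directory rather than heap usage.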