openforcefield / openff-qcsubmit

Automated tools for submitting molecules to QCFractal
https://openff-qcsubmit.readthedocs.io/en/latest/index.html
MIT License
26 stars 4 forks source link

Use `get_records_with_cache` to cache `to_records` calls #286

Open ntBre opened 3 weeks ago

ntBre commented 3 weeks ago

Description

This PR uses the get_records_with_cache function discussed in our last QCArchive meeting to cache to_records calls automatically. With these changes alone, users cannot actually access this behavior, but when combined with #284, it enables code like this:

from qcportal import PortalClient

from openff.qcsubmit._tests.utils.test_manager import no_internet
from openff.qcsubmit.results import OptimizationResultCollection
from openff.qcsubmit.utils.utils import portal_client_manager

opt = OptimizationResultCollection.parse_file("tiny-opt-dataset.json")
client = PortalClient("https://api.qcarchive.molssi.org:443", cache_dir=".")

with portal_client_manager(lambda _: client):
    print(len(opt.to_records()))
    with no_internet():
        print(len(opt.to_records()))

For this PR, I applied the changes separately from #284 to the main branch, but I have used this code to test the combined changes locally, and it works! I have not run into MolSSI/QCFractal#844 here yet, so hopefully that is less common in real use cases.

Todos

Status

ntBre commented 2 weeks ago

After thinking more about this, possibly a better solution would be for us (or actually a user) to subclass PortalClient and override the various get_{singlepoints,optimizations,torsiondrives} methods with cached versions. That would avoid having to add the record_cache field to the standard PortalClient, which felt weird to me, and revert all of these changes because they would be handled by providing this new CachedPortalClient. I'll try this approach tomorrow.

ntBre commented 2 weeks ago

Here's an example of the PortalClient subclass. Unfortunately, I also found out that the no_internet trick is not infallible. I think the fact that the PortalClient opens a requests.Session and holds onto it means that it can access the internet without going back through socket.socket. However, I monitored internet traffic with Wireshark and made sure that this code does not access the internet on the second to_records call, which is also verified by setting _req_session to None.

import os
import shutil

from qcportal import PortalClient
from qcportal.cache import RecordCache, get_records_with_cache
from qcportal.optimization import OptimizationRecord

from openff.qcsubmit._tests.utils.test_manager import no_internet
from openff.qcsubmit.results import OptimizationResultCollection
from openff.qcsubmit.utils.utils import portal_client_manager

class CachedPortalClient(PortalClient):
    def __init__(self, addr, cache_dir, **client_kwargs):
        super().__init__(addr, cache_dir=cache_dir, **client_kwargs)
        self.record_cache = RecordCache(
            os.path.join(self.cache.cache_dir, "cache.sqlite"), read_only=False
        )

    def get_optimizations(self, record_ids, missing_ok=False, include=None):
        return get_records_with_cache(
            self,
            self.record_cache,
            OptimizationRecord,
            record_ids,
            include=["initial_molecule", "final_molecule"],
        )

if os.path.exists("api.qcarchive.molssi.org_443"):
    shutil.rmtree("api.qcarchive.molssi.org_443")

opt = OptimizationResultCollection.parse_file("tiny-opt-dataset.json")
client = CachedPortalClient("https://api.qcarchive.molssi.org:443", cache_dir=".")

with portal_client_manager(lambda _: client):
    client._req_session = None
    with no_internet():
        print(len(opt.to_records()))

This code runs on the branch for #284 without any changes to qcsubmit. The implementation of CachedPortalClient basically needs to know which methods we call on PortalClient internally, though, so I think it might make sense for us to provide this class even though a user could define it.

ntBre commented 1 week ago

I've updated the implementation to use the CachedPortalClient from the code block above. We discussed making the class private, but I left it public for now because I think it might be useful for anyone wanting to use additional PortalClient kwargs with the portal_client_manager. I'm happy to make it private if you prefer, though. That's a pretty advanced use case and maybe not one we want to support in the public API.

I have not added any tests specifically for the caching yet, but I think it's a great first sign that the tests pass after replacing the default PortalClient with this cached version. I also tried intentionally introducing errors in each of the new methods to make sure they were actually being called, so they are all covered by existing tests at least.

One issue I've run into with testing this is that each test uses the same default cache_dir. I tried adding a pytest fixture to delete the cache dir before each test ran, but this caused disastrous issues with pytest-xdist running multiple tests in parallel (one would be trying to write to the cache while the next test tried deleting the dir). I think this means that fully testing the caches will be more involved than I hoped. It also means that I currently have to delete the cache dir locally between each test run, or the tests just use the cache. That won't be an issue in CI at least. My best idea for fixing this currently is to modify the tests using PortalClients more invasively to wrap the code in portal_client_managers with temporary cache dirs so that each test can run fully independently and clean itself up afterward.

ntBre commented 6 days ago

The tests should now be passing with temporary cache directories everywhere CachedPortalClient is used. In my local tests I was seeing ignored instances of the error from MolSSI/QCFractal#844, but it sounds like that has been fixed in the next version (0.56) of qcportal.

Now, I think I just need to update these tests to check that the caching actually worked.

ntBre commented 5 days ago

The tests have been updated, and I copied over (and modified) documentation from qcportal. It should be ready for review!