Implement a developer friendly caching for larger projects

oprypkhantc commented 4 months ago

Hey :)

So since the changes to loading classes, we can now use a broader namespaces (like App) instead of specifying only specific namespaces containing GraphQLite related classes. This means that if you do specify a namespace containing a lot of classes, class discovery using class-finder will take a long time, since it currently scans and reflects all classes in that namespaces.

In our case, it takes about 10sec just to discover all classes in our project. About 1/10th of that is discovering files, while 9/10th is autoloading (or include_once, doesn't make a huge difference) and reflection.

There is a setting called globTTL that sets an expiration on those globbed classes, but it's kind of useless:

if you set it to a high number (like 1 hour), then you force developers to manually clear the cache every time they make changes to the codebase
if you set it to a low number (like the default 2 for the dev mode), then classes are scanned on every request / every server start

This is not ideal, and there are things we can improve upon.

First, right now there are separate Type and Controller namespaces, and completely separate code that handles them. Meaning if you specify the same namespace for both of these, GraphQLite will scan classes in the same namespace twice. Instead of doing that, we can combine these into the same thing (just addNamespace() instead of addTypesNamespace() and addControllersNamespace()) and also make sure both of these use the same piece of code to discover classes, with the same cache. Doing so will simplify the setup and cut the scan time in half for a use case like ours, while not affecting other use cases at all (since the classes are cached in memory anyway)

Second, add an interface like GraphQLAttribute and make sure every annotation/attribute implements it. This gives us knowledge of whether a discovered class is even relevant to us at all or it has no relation to GraphQLite whatsoever.

Third, implement a custom iterator for FinderInterface that does two things:

first, it only returns classes related to GraphQLite by looking whether classes/methods have GraphQLite attributes
second, it uses filemtime() (file modification time) function to determine whether a class (or any of it's parents) has changed since the last scan

Basically this for the first scan (without cache):

scan all files
reflect all classes
write those related to GraphQLite to relatedClasses cache
write those unrelated to unrelatedFiles cache, with file modification time for each of them

Every subsequent scan (with cache):

scan all files
return and reflect only related classes from relatedClasses and those that changed since the last scan from unrelatedFiles
completely ignore files unrelated to GraphQLite or unchanged since the last scan

This way we'll only autoload/reflect/parse classes the first time and won't bother checking them again next times.

Point 1 is relatively trivial, but will require a deprecation. Point 2 is trivial. Point 3 is not that trivial. I understand that you might not want to support that, so as long as points 1-2 are implemented, I can do this one separately on our side :)

What do you think?

oprypkhantc commented 4 months ago

This is actually harder than I anticipated due to additional caching that AbstractTypeMapper does internally.

oojacoboo commented 4 months ago

I think this all sounds like a good performance improvement. I don't see any issues with a custom Finder really. It can be overridden anyway through the schema factory. What's the performance hit for filemtime though? Maybe that could only be called in dev and for prod you'd need to manually clear the cache.

oprypkhantc commented 4 months ago

The performance hit of filemtime is relatively negligible, but yes, for production I'd rather avoid even that.

I'm looking into how caching is implemented in AbstractTypeMapper and will report my progress :)

fogrye commented 4 months ago

@oprypkhantc I saw that AbstractTypeMapper is only one class which uses thecodingmachine/cache-utils which is same dead lib as class-explorer was. I would like to suggest removal of it while doing your cache improvements. It's blocker for psr/simple-cache and I saw people had issues with that.

oprypkhantc commented 4 months ago

@fogrye I saw that one too, but I actually thought it was quite useful. I'm refactoring the existing caching to use it :)

That said, I'm stuck using intera-doo/cache-utils's fork too :( If there's still no movement by the time I'm done, I guess I can just copy-paste the classes from cache-utils. Not sure there's more we can do.

thecodingmachine / graphqlite

Implement a developer friendly caching for larger projects #671