StarpTech / apollo-datasource-http

Optimized JSON HTTP Data Source for Apollo Server
MIT License

[Performance] Using x2 to x3 more cache memory and bandwidth than apollo-datasource-rest #42

Closed — kdybicz closed this issue 2 years ago

kdybicz commented 2 years ago

It looks like, due to the rather simplistic implementation of maxTtlIfError in http-data-source.ts, cached data is sent over the wire and stored in the cache twice.

On these graphs you can see that this implementation (plus probably some other, less optimised code than in apollo-datasource-rest) is causing a huge, unexpected influx of data into our Redis instance.

Additionally, it looks like we're not able to stop using the maxTtlIfError functionality, because of the way the option has been defined and used in the code.
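To make the complaint concrete, the pattern being described is roughly the following: every cacheable response is written under two keys, a primary one with the regular TTL and a fallback one kept for maxTtlIfError. This is an illustrative sketch only — the class and method names here are assumptions, not the library's actual code — with an in-memory map standing in for Redis:

```typescript
// In-memory stand-in for a Redis-backed KeyValueCache, so the sketch runs standalone.
class InMemoryCache {
  private store = new Map<string, string>()
  async set(key: string, value: string, _opts: { ttl: number }): Promise<void> {
    this.store.set(key, value)
  }
  async get(key: string): Promise<string | undefined> {
    return this.store.get(key)
  }
  get size(): number {
    return this.store.size
  }
}

// Hypothetical double-write, as described in the issue: the same body is
// serialized and sent to the cache twice, under two different keys.
async function cacheResponse(
  cache: InMemoryCache,
  requestCacheKey: string,
  body: string,
  maxTtl: number,
  maxTtlIfError: number,
): Promise<void> {
  // Primary entry, served while the response is considered fresh.
  await cache.set(requestCacheKey, body, { ttl: maxTtl })
  // Fallback entry under a second key with a longer TTL, served only when the
  // origin errors -- this second write is what doubles memory and bandwidth.
  await cache.set(`staleIfError:${requestCacheKey}`, body, {
    ttl: maxTtl + maxTtlIfError,
  })
}
```

With this scheme every successful response costs two cache writes and two stored copies of the body, which matches the roughly 2x Redis memory and bandwidth reported above.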

StarpTech commented 2 years ago

This is to be expected, because we make use of two caches with different lifetimes. If you don't need it, feel free to create a PR to make it optional.

kdybicz commented 2 years ago

That's the simplest implementation possible. I believe there might be a better way of addressing the issue than a simple on/off switch for the second key. I'd be curious to see if anyone could come up with a more optimised caching strategy.
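One possible strategy along those lines (a sketch under assumed names, not the library's code) is to store the body once under a single key for the combined lifetime, record the freshness deadline alongside it, and decide at read time whether a stale entry may be served as an error fallback:

```typescript
// A cached body plus the point in time after which it counts as stale.
interface CachedEntry {
  body: string
  freshUntil: number // epoch ms; after this the entry is stale but retained
}

// Single-key cache: one write per response, entry retained for
// maxTtl + maxTtlIfError in total.
class SingleKeyCache {
  private store = new Map<string, { entry: CachedEntry; expiresAt: number }>()

  set(
    key: string,
    body: string,
    maxTtlSec: number,
    maxTtlIfErrorSec: number,
    now: number = Date.now(),
  ): void {
    this.store.set(key, {
      entry: { body, freshUntil: now + maxTtlSec * 1000 },
      // Single write covering both lifetimes.
      expiresAt: now + (maxTtlSec + maxTtlIfErrorSec) * 1000,
    })
  }

  // `originErrored` is true when the upstream request just failed, which is
  // the only case in which a stale (but not yet expired) entry is returned.
  get(key: string, originErrored: boolean, now: number = Date.now()): string | undefined {
    const hit = this.store.get(key)
    if (!hit || now > hit.expiresAt) return undefined
    if (now <= hit.entry.freshUntil) return hit.entry.body // fresh hit
    return originErrored ? hit.entry.body : undefined // stale, error fallback only
  }
}
```

This keeps the stale-if-error behaviour while storing and transferring each body once; the trade-off is that reads need the small freshness check instead of a plain key lookup.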

StarpTech commented 2 years ago

Is there anything left here? You can create a discussion here https://github.com/StarpTech/apollo-datasource-http/discussions

kdybicz commented 2 years ago

Creating a discussion might be a good idea, but I think it's worth adding a "disclaimer" informing devs that this implementation doubles the amount of memory used for the cache, as this is not obvious at the moment.

kdybicz commented 2 years ago

Discussion created: https://github.com/StarpTech/apollo-datasource-http/discussions/44

StarpTech commented 2 years ago

This highly depends on the use case. I preferred high availability over space efficiency.

kdybicz commented 2 years ago

But to do that you need to start throwing more and more money at top-tier AWS machines for the cache and crazy bandwidth, which, e.g. in our case, is less than ideal :)