30350n / inventree_part_import

CLI to import parts from suppliers like DigiKey, LCSC, Mouser, etc. to InvenTree
MIT License

A disk-based cache for all provider queries to prevent provider API throttling #53

Open randrej opened 1 month ago

randrej commented 1 month ago

If you play around with settings and importing parts too much, you might get yourself throttled or blocked by the providers. Adding caching would prevent you from repeating the same queries to the providers, saving you from this fate.

Might make sense to use something like diskcache (sqlite-based) to cache all provider queries based on the arguments, with a configurable TTL (say 6h for a default).

This would imply that if you have caching enabled, whenever you search for a part or similar, the result is cached. If you search for it again in the next n hours, you'll just get the cached response and you won't hit the API again, lessening the chance of getting yourself throttled.
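A minimal sketch of that idea, assuming provider queries return JSON-serializable data and take JSON-serializable arguments. The stdlib `sqlite3` module stands in for diskcache here so the example has no dependencies; `QueryCache`, `cached_query`, and `DEFAULT_TTL` are hypothetical names, not part of inventree_part_import.

```python
import functools
import hashlib
import json
import sqlite3
import time

DEFAULT_TTL = 6 * 60 * 60  # 6 hours, the default suggested above


class QueryCache:
    """Tiny sqlite-backed key/value store with per-entry expiry (hypothetical)."""

    def __init__(self, path, ttl=DEFAULT_TTL, clock=time.time):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT, expires REAL)"
        )

    def get(self, key):
        row = self.db.execute(
            "SELECT value, expires FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or row[1] <= self.clock():
            return None  # missing or expired
        return json.loads(row[0])

    def set(self, key, value):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (key, json.dumps(value), self.clock() + self.ttl),
        )
        self.db.commit()

    def clear(self):
        self.db.execute("DELETE FROM cache")
        self.db.commit()


def cached_query(cache):
    """Decorator: cache a function's result, keyed on its name and arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            raw = json.dumps([func.__qualname__, args, kwargs], sort_keys=True)
            key = hashlib.sha256(raw.encode()).hexdigest()
            hit = cache.get(key)
            if hit is not None:
                return hit
            result = func(*args, **kwargs)  # only reached on a miss or expiry
            cache.set(key, result)
            return result
        return wrapper
    return decorator
```

In this sketch a result of `None` is never served from the cache, so a query that returns nothing would be repeated on every call; whether empty results should be cached is a separate design decision.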

Might make sense to add a CLI arg for avoiding cache, and a subcommand or flag for clearing it.
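Roughly what those options could look like, sketched with `argparse`. The flag names `--no-cache` and `--clear-cache` and the program name are placeholders I made up for illustration; the project's actual CLI may be built differently.

```python
import argparse


def build_parser():
    # Hypothetical parser; not the real inventree_part_import CLI definition.
    parser = argparse.ArgumentParser(prog="example-import")
    parser.add_argument("query", nargs="*", help="part numbers or search terms")
    parser.add_argument(
        "--no-cache",
        action="store_true",
        help="bypass the disk cache for this run and always query the provider",
    )
    parser.add_argument(
        "--clear-cache",
        action="store_true",
        help="delete all cached provider responses",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

A flag rather than a subcommand keeps clearing the cache combinable with a normal import run, though a dedicated subcommand would work just as well.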

randrej commented 1 month ago

Btw, I'd like to help with things like this. I'm not just asking you to do the work, but you gotta approve it first.

30350n commented 1 month ago

I thought about this a while ago, but ultimately decided it's not really worth the effort. With normal use (i.e. not during initial setup/testing), duplicate requests shouldn't really happen between runs. I'm already caching requests during runtime, which imo is sufficient.
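For contrast with a disk cache, per-run caching can be illustrated with `functools.lru_cache`. This is only a generic sketch of the technique, not the project's actual code; `fetch_part` and its fake response are hypothetical.

```python
import functools


@functools.lru_cache(maxsize=None)
def fetch_part(provider, part_number):
    # Stand-in for a real provider request (hypothetical); the result lives
    # in memory only and is lost when the process exits.
    return (provider, part_number, "fake-response")


fetch_part("LCSC", "C12345")  # first call runs the function body
fetch_part("LCSC", "C12345")  # identical call is answered from memory
```

Such a cache deduplicates requests within a single run, but a second invocation of the tool starts with an empty cache, which is the gap the disk-based proposal is aimed at.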

> you might get yourself throttled or blocked by the providers

This only applies to suppliers that I use crawling for (Mouser, LCSC, reichelt). Of those, Mouser currently doesn't work at all because they hardened their protections (see #49), and I haven't had many problems with blocking from either LCSC or reichelt.