molybdenum-99 / reality

Comprehensive data proxy to knowledge about real world
MIT License

Entity caching #43

Open zverok opened 8 years ago

zverok commented 8 years ago

First take: eternal cache with manual cleanup.

  1. Cache is opt-in (should be explicitly turned on through config or Reality.cache!)
  2. Cache path should be configurable (with a reasonable default somewhere in /tmp?..)
  3. Each entity is cached by name, in two separate files: Wikipedia data & Wikidata data (we cache the source, not the parsed entity, because after adding new parsers richer data can be extracted from the same cached source; the drawback is parsing speed)
  4. There should be a way to query whether an entity came from the cache or from the wild (or, alternatively, how old it is)
  5. Dictionaries (lists) should be cached as well.
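
A rough sketch of points 2-4, just to make the idea concrete. Everything here is an assumption for illustration: `SourceCache`, its methods, the `reality_cache` directory name and the hashed file names are hypothetical and not part of Reality's current API. Opt-in (point 1) would simply mean that nothing constructs this object unless the user asks for it.

```ruby
require 'fileutils'
require 'tmpdir'
require 'digest/sha1'

# Hypothetical helper, NOT an existing Reality class.
# Stores the raw (unparsed) source for each entity, one file per source.
class SourceCache
  SOURCES = %i[wikipedia wikidata].freeze

  # Assumed default location: a subdirectory of the system temp dir.
  def initialize(path: File.join(Dir.tmpdir, 'reality_cache'))
    @path = path
    FileUtils.mkdir_p(@path)
  end

  # Returns [raw_source, from_cache].
  # The block is called only on a miss and should fetch "from the wild".
  def fetch(name, source)
    file = file_for(name, source)
    return [File.read(file), true] if File.exist?(file)

    raw = yield
    File.write(file, raw)
    [raw, false]
  end

  def cached?(name, source)
    File.exist?(file_for(name, source))
  end

  # Age of the cached copy in seconds, or nil if nothing is cached.
  def age(name, source)
    file = file_for(name, source)
    File.exist?(file) ? Time.now - File.mtime(file) : nil
  end

  # Manual cleanup: entries never expire on their own.
  def clear!
    FileUtils.rm_rf(Dir.glob(File.join(@path, '*')))
  end

  private

  # Entity names may contain slashes or spaces, so the name is hashed
  # to get a safe file name (an assumption of this sketch).
  def file_for(name, source)
    raise ArgumentError, "unknown source: #{source}" unless SOURCES.include?(source)

    File.join(@path, "#{Digest::SHA1.hexdigest(name)}.#{source}")
  end
end
```

Usage would look roughly like `raw, from_cache = cache.fetch('Argentina', :wikipedia) { download_wikitext('Argentina') }`, where `download_wikitext` stands in for whatever actually performs the request. Since the raw source is what gets stored, parsing still happens on every load, which is the speed drawback mentioned in point 3.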

In the future, we should consider: