o0111 / ruralcafe

Automatically exported from code.google.com/p/ruralcafe

Larger granularity caching #41

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Currently each item (image, script, HTML page) is treated as independent. The 
cache size can be limited, and an LRU strategy is then applied to evict 
individual items from the cache.

This can break a website's appearance if embedded objects get evicted but the 
page itself does not. In the opposite case, where the page is evicted but its 
embedded objects are not, space is wasted until those objects get evicted, too.

We would need a new database table for websites, with a many-to-many 
association: one website can have zero or more embedded objects (the HTML page 
itself should count as one of them here), and one object is embedded in at 
least one page.
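
A minimal sketch of what such a schema could look like, using Python's built-in `sqlite3` module purely for illustration. The table names (`pages`, `items`, `page_items`), the column names and the helper functions are hypothetical and not taken from the RuralCafe code base; the real cache database may be organised differently.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE pages (
    id          INTEGER PRIMARY KEY,
    url         TEXT NOT NULL UNIQUE,
    last_access REAL NOT NULL          -- used to pick the LRU page
);
CREATE TABLE items (
    id          INTEGER PRIMARY KEY,
    url         TEXT NOT NULL UNIQUE,
    size_bytes  INTEGER NOT NULL
);
-- Join table for the many-to-many association between pages and items.
CREATE TABLE page_items (
    page_id INTEGER NOT NULL REFERENCES pages(id) ON DELETE CASCADE,
    item_id INTEGER NOT NULL REFERENCES items(id) ON DELETE CASCADE,
    PRIMARY KEY (page_id, item_id)
);
"""


def open_cache_db(path=":memory:"):
    """Open a database and create the sketch schema in it."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_page(conn, page_url, item_sizes, now):
    """Register a page and its items.

    item_sizes maps item URL -> size in bytes and should include the
    HTML page itself, as proposed above.
    """
    conn.execute(
        "INSERT OR IGNORE INTO pages (url, last_access) VALUES (?, ?)",
        (page_url, now),
    )
    conn.execute(
        "UPDATE pages SET last_access = ? WHERE url = ?", (now, page_url)
    )
    page_id = conn.execute(
        "SELECT id FROM pages WHERE url = ?", (page_url,)
    ).fetchone()[0]
    for item_url, size in item_sizes.items():
        # An item shared by several pages is stored only once.
        conn.execute(
            "INSERT OR IGNORE INTO items (url, size_bytes) VALUES (?, ?)",
            (item_url, size),
        )
        item_id = conn.execute(
            "SELECT id FROM items WHERE url = ?", (item_url,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO page_items (page_id, item_id) VALUES (?, ?)",
            (page_id, item_id),
        )
    conn.commit()
    return page_id
```

With this layout, an object embedded in two pages has one row in `items` and two rows in `page_items`, which is the information the eviction step needs.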

Then, when evicting, whole websites should be deleted, together with only 
those embedded objects that are not currently embedded in another page.
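
The eviction rule above could be sketched roughly as follows. This is an in-memory illustration only, assuming a per-item reference count in place of the database association; the class `PageCache` and all of its members are hypothetical names, not part of RuralCafe.

```python
from collections import OrderedDict


class PageCache:
    """Page-granularity LRU sketch: evicts whole pages, keeps shared items."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.pages = OrderedDict()  # page URL -> set of item URLs, LRU first
        self.item_size = {}         # item URL -> size in bytes
        self.ref_count = {}         # item URL -> number of cached pages using it

    def touch(self, page_url):
        """Mark a cached page as most recently used."""
        self.pages.move_to_end(page_url)

    def add_page(self, page_url, item_sizes):
        """Cache a page with its items and return the list of evicted pages.

        item_sizes maps item URL -> size and should include the HTML page.
        """
        if page_url in self.pages:
            self._remove_page(page_url)
        self.pages[page_url] = set(item_sizes)
        for url, size in item_sizes.items():
            if url not in self.ref_count:
                # First page to embed this item: it now occupies space.
                self.ref_count[url] = 0
                self.item_size[url] = size
                self.used_bytes += size
            self.ref_count[url] += 1
        evicted = []
        # Evict least recently used pages, but never the page just added.
        while self.used_bytes > self.max_bytes and len(self.pages) > 1:
            oldest = next(iter(self.pages))
            self._remove_page(oldest)
            evicted.append(oldest)
        return evicted

    def _remove_page(self, page_url):
        for url in self.pages.pop(page_url):
            self.ref_count[url] -= 1
            if self.ref_count[url] == 0:
                # No other cached page embeds this item, so delete it too.
                del self.ref_count[url]
                self.used_bytes -= self.item_size.pop(url)
```

For example, with a 200-byte limit, caching page `a` (100 bytes plus a 50-byte logo) and then page `b` (80 bytes plus the same logo) evicts page `a` as a whole but keeps the logo, because `b` still embeds it.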

For packages from the remote proxy it is easy to see which items are embedded, 
since all items in the package are embedded in the page. For streaming, or 
when downloading on the remote side, this is rather difficult.

Original issue reported on code.google.com by satiaher...@gmx.de on 24 Sep 2013 at 9:23