lancachenet / monolithic

A monolithic lancache service capable of caching all CDNs in a single instance
https://hub.docker.com/r/lancachenet/monolithic

Hit ratio inflated due to slice #40

Open ilumos opened 5 years ago

ilumos commented 5 years ago

Hello friends!

Using nginx's slice module means that:

"the first slice is fetched in the main request, while all other slices are fetched in subrequests. Since you log only the main request (by default), the variable $upstream_cache_status returns cache status of the first slice, which is likely to be cached." (arut, https://trac.nginx.org/nginx/ticket/1200)

This paints a rosy picture when logging via lancachenet/logstash and visualising in Kibana, showing hit ratios that are much better than they really are.

Simply enabling subrequest logging with log_subrequest on would count bytes twice: once for the main request (e.g. 5MB) and again for the 5x 1MB subrequests. That throws the graphs off in a different way.
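To put rough numbers on both problems, here is a small worked example in Python. The scenario is an assumption for illustration only: a 5MB file, 1MB slices, and only the first slice already in cache. The subrequest figures follow the simplified 5MB + 5x 1MB example above, not measured nginx output.

```python
MB = 1024 * 1024

# Assumed scenario: a 5 MB file split into 1 MB slices, only the first slice cached.
slice_statuses = ["HIT", "MISS", "MISS", "MISS", "MISS"]
slice_bytes = 1 * MB
total_bytes = slice_bytes * len(slice_statuses)

# What was actually served from cache, counted slice by slice.
true_hit_bytes = sum(slice_bytes for status in slice_statuses if status == "HIT")
true_ratio = true_hit_bytes / total_bytes

# Main request only: one log line whose status is that of the first slice,
# applied to all 5 MB of the response.
main_only_hit_bytes = total_bytes if slice_statuses[0] == "HIT" else 0
main_only_ratio = main_only_hit_bytes / total_bytes

# Subrequest logging enabled, using the simplified figures from the text:
# one 5 MB main line plus five 1 MB subrequest lines.
logged_bytes_with_subrequests = total_bytes + slice_bytes * len(slice_statuses)

print(f"true byte hit ratio:        {true_ratio:.0%}")
print(f"main-request-only ratio:    {main_only_ratio:.0%}")
print(f"bytes actually served:      {total_bytes // MB} MB")
print(f"bytes logged w/ subrequests: {logged_bytes_with_subrequests // MB} MB")
```

In this made-up case the main-request-only log reports a 100% byte hit ratio when only 20% of the bytes came from cache, and turning on subrequest logging would record 10MB for a 5MB download.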

Nginx doesn't have an equivalent to Apache's IS_SUBREQ variable, so we need some other way to tell main request log lines from subrequest log lines. One option, discussed with @GotenXiao at UK LAN Techs hack weekend 2.0, is to add a header to one or the other request and log it.

Once we can differentiate between log lines, in Kibana the graphs will need to be updated so that:

... and many of the other graphs too, to avoid counting bytes twice as described above.
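As a rough sketch of what the log processing side could look like once lines can be told apart, here is some Python. Everything in it is hypothetical: the LogLine type, the is_subrequest field (standing in for whatever marker ends up being logged), and the helper name are illustrative and are not part of lancachenet/logstash or nginx.

```python
from dataclasses import dataclass
from typing import Iterable

MB = 1024 * 1024


@dataclass
class LogLine:
    bytes_sent: int
    cache_status: str
    # Hypothetical marker: per the discussion above, nginx has no built-in
    # equivalent, so this would have to come from a logged header or similar.
    is_subrequest: bool


def served_bytes(lines: Iterable[LogLine]) -> int:
    """Sum bytes from main request lines only, so slice bytes are not counted twice."""
    return sum(line.bytes_sent for line in lines if not line.is_subrequest)


# Illustrative data: one 5 MB main line plus five 1 MB subrequest lines.
sample = [LogLine(5 * MB, "HIT", False)] + [
    LogLine(1 * MB, status, True)
    for status in ["HIT", "MISS", "MISS", "MISS", "MISS"]
]

print(served_bytes(sample) // MB, "MB served")
print(sum(line.bytes_sent for line in sample) // MB, "MB if every line is counted")
```

Byte totals taken from main request lines alone stay at the size the client actually downloaded, while the subrequest lines remain available for per-slice cache status.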

Thanks!