fazalmajid / temboz

The Temboz RSS/Atom feed reader
MIT License

add time dimension to SNR #42

Open fazalmajid opened 11 years ago

fazalmajid commented 11 years ago

The SignalNoiseRatio report and sorting do not currently take time into consideration. This is problematic, as feeds can lose relevance over time. Implementing something like an exponentially decaying SNR (like that used to compute UNIX load averages) would improve the usefulness of this feature without unduly penalizing high-quality, ultra-low-volume feeds the way an arbitrary cut-off to the last N days would.
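To make the idea concrete, here is a minimal sketch of an exponentially decayed SNR. It assumes each rated item is a `(rated_at_days, is_interesting)` pair and that a 90-day half-life is reasonable. The function name, the tuple layout and the half-life are all hypothetical, not taken from the Temboz code.

```python
import math

def decayed_snr(items, now, half_life_days=90.0):
    """Signal-to-noise ratio where each item counts for 2^(-age / half_life).

    items: iterable of (rated_at_days, is_interesting) pairs (hypothetical layout).
    now:   current time, in the same day units as rated_at_days.
    """
    decay = math.log(2) / half_life_days
    signal = total = 0.0
    for rated_at, interesting in items:
        weight = math.exp(-decay * (now - rated_at))
        total += weight
        if interesting:
            signal += weight
    return signal / total if total else 0.0

# A feed that was good a year ago but has been poor recently scores low...
stale = [(0, True)] * 10 + [(360, False)] * 10
# ...while a single old, interesting item still yields a perfect ratio.
sparse = [(0, True)]
```

Because both numerator and denominator decay at the same rate, a low-volume feed is not penalized for being quiet, which a hard cut-off to the last N days would do.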

fazalmajid commented 11 years ago

2006-Jun-13 22:45:38 by majid:

The implementation in [324] is problematic:

1. It requires a custom aggregate function implemented in Python, and thus the views will not work when run from within the sqlite3 CLI.
2. The aggregate function is recalculated each time the query is run, and is very computationally expensive (especially since Python functions have to be called, not the native C exp function).
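For readers unfamiliar with the mechanism, the following is a rough illustration of a Python-side aggregate registered through the standard `sqlite3` module. It is not the code from [324]. The `items` table, its columns and the `decayed_sum` name are hypothetical.

```python
import math
import sqlite3

class DecayedSum:
    """Aggregate computing sum(value * 2^(-age_days / half_life_days))."""

    def __init__(self):
        self.total = 0.0

    def step(self, value, age_days, half_life_days):
        # Called from SQLite once per row: this is the per-row Python overhead.
        if value is None or age_days is None:
            return
        self.total += value * math.exp(-math.log(2) * age_days / half_life_days)

    def finalize(self):
        return self.total

conn = sqlite3.connect(":memory:")
# The function exists only on this connection; the sqlite3 CLI never sees it.
conn.create_aggregate("decayed_sum", 3, DecayedSum)
conn.execute("CREATE TABLE items (feed INTEGER, rating INTEGER, age_days REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 1, 0.0), (1, 1, 90.0), (1, 0, 180.0)],
)
row = conn.execute(
    "SELECT decayed_sum(rating, age_days, 90.0) FROM items WHERE feed = 1"
).fetchone()
```

Any view that references `decayed_sum` fails on a connection that has not registered it, which is the first problem listed above.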

To compute the SNR at a reasonable cost, the exponentially decaying sums need to be denormalized and precomputed per feed (and updated via a trigger on the items table).
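One possible shape for the precomputed sums is sketched below in plain Python rather than as an SQL trigger. It assumes each feed keeps two running sums plus the time of the last update. The `FeedStats` class, its fields and the half-life are hypothetical.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 90.0  # assumed value, not from the source

@dataclass
class FeedStats:
    signal: float = 0.0      # decayed count of interesting items
    total: float = 0.0       # decayed count of all rated items
    updated_at: float = 0.0  # time of last update, in days

    def add_rating(self, now, interesting):
        # Age the stored sums up to `now`, then add the new item at full weight.
        factor = math.exp(-math.log(2) * (now - self.updated_at) / HALF_LIFE_DAYS)
        self.signal = self.signal * factor + (1.0 if interesting else 0.0)
        self.total = self.total * factor + 1.0
        self.updated_at = now

    def snr(self):
        return self.signal / self.total if self.total else 0.0

stats = FeedStats()
stats.add_rating(0.0, True)
stats.add_rating(90.0, False)
```

Since both sums are scaled by the same factor between updates, their ratio does not change while a feed is idle, so reading the SNR needs no exponential at all. The cost moves from every query to each rating event.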

There is a way to approximate exponential decay while remaining compatible with base SQLite functionality, by using scaled integer arithmetic and SQLite's built-in bit-shift operators << and >>, which will accept floating point numbers as the right operand. We can do this because:

2^(-a - 0.1*b) * x = 2^(-a) * (2^(-0.1))^b * x ~= (x >> a) * (1 - (1>>4) - (1>>8))^b ~= (x >> a) - ((b*x) >> (a+4))

The fit of this crude approximation is shown in the figure below:

{image: approx.gif}
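The approximation above can also be checked numerically. In the formula, (1>>4) and (1>>8) stand for 2^-4 and 2^-8, and 1 - 1/16 - 1/256 is about 0.9336, close to 2^-0.1 (about 0.9330). The last step keeps only the first-order term of (1 - e)^b ~= 1 - b*e with e ~= 1/16. A small sketch follows, assuming x is a scaled integer and a and b are non-negative integers, since Python's shift operators only accept integers. The function names are hypothetical.

```python
def exact(x, a, b):
    """Reference value: x * 2^(-a - 0.1*b), in floating point."""
    return x * 2.0 ** (-a - 0.1 * b)

def approx(x, a, b):
    """Shift-only approximation: (x >> a) - ((b*x) >> (a+4))."""
    return (x >> a) - ((b * x) >> (a + 4))

x = 1 << 20  # scaled integer, so the shifts do not discard everything
for b in range(10):
    e = exact(x, 2, b)
    print(b, approx(x, 2, b), round(e), f"{(approx(x, 2, b) - e) / e:+.1%}")
```

The error is zero at b = 0 and grows with b, because the dropped higher-order terms matter more as b approaches 10.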


2006-Jun-14 00:04:03 by majid:

Or a much simpler solution: just create a lookup table for the function, to one or two decimals...
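A rough sketch of the lookup-table idea, assuming the function being tabulated is 2^(-x) with x (age in half-lives) rounded to one decimal and capped at 10 half-lives. The names and the cap are hypothetical. In the database the same values could presumably live in an ordinary table joined on the rounded age.

```python
# Precomputed 2^(-x) for x = 0.0, 0.1, ..., 10.0 (index = x in tenths).
TABLE = [2.0 ** (-i / 10) for i in range(101)]

def decay(x):
    """Look up 2^(-x) with x rounded to one decimal; ~0 beyond the table."""
    i = round(x * 10)
    if i >= len(TABLE):
        return 0.0  # older than 10 half-lives: weight below 0.1%, treated as zero
    return TABLE[max(i, 0)]
```

With one decimal the table has only 101 entries, and no exponential is evaluated at query time.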