Another way to avoid overcounting would be to use the buffer only to refine the linearization point, without letting the buffered data update the posterior mean or covariance. For example, at the beginning of step t we have belief state (\mu_{t|t-1}, \Upsilon_{t|t-1}, W_{t|t-1}) and a linearized model \hat{h}_t based at \mu_{t-1}. We run lofi as normal through all items in \data_{t-b:t} (a total of b+1 predict-then-update steps), yielding a new belief state (\mu, \Upsilon, W). We then throw out \Upsilon and W and define a new linearized model \hat{h} based at \mu. Finally, we do a single update step from (\mu_{t|t-1}, \Upsilon_{t|t-1}, W_{t|t-1}) using \hat{h} and \data_t, so only the current observation enters the posterior.
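A minimal sketch of this relinearize-then-update-once scheme, in Python. The names lofi_predict, lofi_update, and linearize are hypothetical placeholders (not the actual lofi API) standing in for the usual predict step, the update step given a linearized measurement model, and relinearization of h at a given mean; the belief state is represented as a tuple (mu, Upsilon, W).

```python
# Sketch only: lofi_predict, lofi_update, and linearize are hypothetical
# placeholders, used here just to make the control flow concrete.

def relinearized_step(belief_pred, h, buffer, data_t,
                      lofi_predict, lofi_update, linearize):
    """belief_pred = (mu_{t|t-1}, Upsilon_{t|t-1}, W_{t|t-1}) at the start of step t.
    buffer = D_{t-b:t}, the last b+1 (x, y) pairs, including the current one.
    data_t = (x_t, y_t), the current observation alone.
    """
    # Pass 1: run lofi as normal through the whole buffer (b+1
    # predict-then-update steps), purely to obtain a refined mean.
    belief = belief_pred
    for x_i, y_i in buffer:
        belief = lofi_predict(belief)
        h_lin = linearize(h, belief[0])      # linearize at the predicted mean
        belief = lofi_update(belief, h_lin, x_i, y_i)
    mu_refined = belief[0]                   # keep mu; throw out Upsilon and W

    # Pass 2: define \hat{h} at the refined mean and do a single update of the
    # *original* predicted belief using only the current datum, so the buffered
    # observations never enter the posterior precision.
    h_hat = linearize(h, mu_refined)
    x_t, y_t = data_t
    return lofi_update(belief_pred, h_hat, x_t, y_t)
```

Note that the final update starts from the original (\mu_{t|t-1}, \Upsilon_{t|t-1}, W_{t|t-1}) rather than from the pass-1 belief, which is what prevents the buffered data from being counted twice.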
cf. Á. F. García-Fernández, L. Svensson, and S. Särkkä, "Iterated Posterior Linearization Smoother," IEEE Trans. Automat. Contr., vol. 62, no. 4, pp. 2056-2063, Apr. 2017, doi: 10.1109/TAC.2016.2592681. [Online]. Available: https://web.archive.org/web/20200506190022id_/https://research.chalmers.se/publication/249335/file/249335_Fulltext.pdf