Closed lpawluczuk closed 7 years ago
I am having problems reviewing this PR. It has too many changes in a big commit and I don't really see the reason of many of the changes (not only the ones I commented, but also many of the rest). I think they are complicating unnecessarily the code. I was thinking on something much more simple. I think we don't need to change so many things to achieve the same.
If you create a new PR with only one commit which only adds the reservoir sampling class (and another that adds tests for the isolated class, it should be possible), I can do the changes needed in the exporters for fitting on it. It should be possible to do things much more simple.
don't remove this PR, I will use it for reference
I will proceed in the inverse way. I will provide the basis for adding custom write buffers and then you create a PR for the reservoir sampling.
This PR is old and has been replaced by other ones
This is implementation of reservoir sampling for stream of items on writers level.
It addresses this issue: https://github.com/scrapinghub/exporters/issues/317