Get feedback on amount of unflushed data in buffers

After researching this in more depth I've decided not to implement it because it is too complicated. I'm leaving my findings here in case I come back to it later, or in case someone else wants to look into it.

For LZMA you can inspect the following variables to determine (a) how much buffered input has been read from the input source but not yet processed and (b) how much buffered output has not yet been written to the output sink:

// mEncoder is `Master.LZMA.CLzmaEnc` - for example the member variable in `AsyncEncoder`

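// the number of (unprocessed) input bytes in the LZMA encoder
// NOTE: `fetched` is not defined in this snippet; it is assumed here to be a count of
// input bytes that were already fetched from the source but are not yet visible to the match finder
var pMatchFinder = mEncoder.mMatchFinderBase;
var pMatchFinderCachedBytes = pMatchFinder.mStreamPos - pMatchFinder.mPos + fetched;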

// the number of (unflushed) output bytes in the LZMA encoder
var pRangeCoder = mEncoder.mRC;
var pRangeCoderCachedBytes = pRangeCoder.mBuf - pRangeCoder.mBufBase;

At this level the main problem is that the variables are not synchronized. It may be possible to turn all writes to them into volatile writes, but it should first be measured whether volatile writes hurt performance when the feature is not used (i.e. when there are no volatile reads). If they do, this shouldn't be done. Reading the variables without any synchronization is, of course, always available as a last resort.
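To make the volatile-write idea concrete, here is a minimal sketch of what it could look like. `BufferCounters` and its members are hypothetical and not part of managed-lzma; the real encoder keeps these positions in `mMatchFinderBase` and `mRC`, and their actual field types may differ. The sketch assumes a single encoder thread doing all the writes and a separate monitoring thread doing the reads.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the encoder's position fields (not managed-lzma code).
sealed class BufferCounters
{
    private long mStreamPos; // total bytes read from the input source so far
    private long mPos;       // total bytes the encoder has consumed so far

    // Called only by the encoder thread: publish the new positions with volatile writes.
    public void OnInputRead(long count) => Volatile.Write(ref mStreamPos, mStreamPos + count);
    public void OnInputProcessed(long count) => Volatile.Write(ref mPos, mPos + count);

    // Callable from any thread. The two reads are not atomic as a pair, so the
    // result is only an estimate. mPos is read first: both counters only grow and
    // mPos never exceeds mStreamPos, so the difference cannot go negative; it can
    // only overestimate if the encoder advances between the two reads.
    public long UnprocessedInputEstimate()
    {
        long pos = Volatile.Read(ref mPos);
        long streamPos = Volatile.Read(ref mStreamPos);
        return streamPos - pos;
    }
}
```

Whether the `Volatile.Write` calls cost anything measurable in the encoder's hot loop is exactly the open question above; this sketch does not answer it and would need to be benchmarked against plain writes.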

The research above covers only the LZMA encoder. It can probably be extended to LZMA2 easily, but I have not tried. The real problem is extending it to 7z encoders, which is what the actual use case for this feature requires. 7z encoders can be configured in arbitrary graphs, and there is no obvious way to report and interpret the buffers between nodes.
