Open PureWhiteWu opened 1 year ago
In our previous usage, we decoded like this:
```rust
pub fn decode(&self, mut data: Bytes) -> Result<Bytes> {
    self.detect(&mut data)?;
    let rd = data.reader();
    let mut rd: Box<dyn Read> = match self {
        CodecType::Gzip => Box::new(GzDecoder::new(rd)),
        CodecType::Snappy => Box::new(FrameDecoder::new(rd)),
        CodecType::Noop => Box::new(rd),
    };
    let mut buf = Vec::new();
    rd.read_to_end(&mut buf)?;
    Ok(buf.into())
}
```
We found that memory allocation in this path accounted for about 25% of CPU time in the flamegraph:
After I added the reset API and used it like this:
```rust
thread_local! {
    static SNAP_DECODER: RefCell<FrameDecoder<Reader<Bytes>>> =
        RefCell::new(FrameDecoder::new(Bytes::new().reader()));
    static MAX_SIZE: RefCell<usize> = RefCell::new(4096);
}

pub fn decode(&self, mut data: Bytes) -> Result<Vec<u8>> {
    self.detect(&mut data)?;
    let rd = data.reader();
    SNAP_DECODER.with(|s| {
        let mut snap_decoder = s.borrow_mut();
        let mut rd: Box<dyn Read> = match self {
            Self::Gzip => Box::new(GzDecoder::new(rd)),
            Self::Snappy => {
                snap_decoder.reset(rd);
                Box::new(&mut *snap_decoder)
            }
            Self::Noop => Box::new(rd),
        };
        MAX_SIZE.with(|s| {
            let mut size = s.borrow_mut();
            let mut buf = Vec::with_capacity(*size);
            rd.read_to_end(&mut buf)?;
            if buf.len() > *size {
                *size = buf.len();
            }
            Ok(buf)
        })
    })
}
```
The CPU usage of this part is greatly reduced, to the point of being negligible:
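For readers unfamiliar with the pattern, the idea behind the `reset` call above is that constructing a fresh `FrameDecoder` allocates internal buffers every time, while resetting an existing one swaps in a new input source and keeps those allocations. A minimal, self-contained sketch of that idea (no `snap` dependency; `Decoder` and `scratch` are made-up names, not the crate's real internals):

```rust
use std::io::{Cursor, Read};

/// Stand-in for a streaming decoder like snap's `FrameDecoder`,
/// simplified for illustration.
struct Decoder<R: Read> {
    rd: R,
    scratch: Vec<u8>, // internal buffer we want to keep across inputs
}

impl<R: Read> Decoder<R> {
    fn new(rd: R) -> Self {
        // A fresh decoder pays for this allocation every time.
        Decoder { rd, scratch: Vec::with_capacity(64 * 1024) }
    }

    /// The `reset` idea: swap in a new source and clear decode state,
    /// but retain `scratch`'s capacity so repeated use doesn't reallocate.
    fn reset(&mut self, rd: R) {
        self.rd = rd;
        self.scratch.clear(); // keeps the allocation, drops the contents
    }

    fn read_all(&mut self) -> std::io::Result<Vec<u8>> {
        self.scratch.clear();
        self.rd.read_to_end(&mut self.scratch)?;
        Ok(self.scratch.clone())
    }
}

fn main() {
    let mut dec = Decoder::new(Cursor::new(vec![1u8; 100]));
    let cap_before = dec.scratch.capacity();
    assert_eq!(dec.read_all().unwrap().len(), 100);

    // Reuse the same decoder for a second payload: no new allocation.
    dec.reset(Cursor::new(vec![2u8; 200]));
    assert_eq!(dec.read_all().unwrap().len(), 200);
    assert_eq!(dec.scratch.capacity(), cap_before);
}
```

Kept in a `thread_local!` as in the snippet above, one such decoder per thread amortizes the buffer allocation across all decode calls on that thread.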
r? @BurntSushi
ping~ @BurntSushi
A kind reminder in case you missed this, @BurntSushi