The root cause of the issue is that multiple HiveWriters share the same DrillBuf, and during execution any of them may reallocate the buffer if it is too small for a value (256 bytes+). Since drillBuf.reallocIfNeeded(int size) returns a new DrillBuf instance, all other writers still hold a reference to the old buffer, which is unmanaged after the drillBuf.reallocIfNeeded(int size) call.
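To make the failure mode concrete, here is a minimal, self-contained sketch. FakeBuf, FakeWriter and SharedBufferDemo are hypothetical stand-ins written for illustration only; they are not Drill classes. The sketch assumes only the behavior described above (a too-small buffer is swaped for a new instance) and models "unmanaged" with a simple flag. The 256-byte capacity just echoes the figure mentioned above.

```java
// Hypothetical stand-in for DrillBuf; it is NOT the real Drill class.
final class FakeBuf {
    private final byte[] data;
    private boolean replaced; // true once a larger buffer has superseded this one

    FakeBuf(int capacity) {
        this.data = new byte[capacity];
    }

    boolean isReplaced() {
        return replaced;
    }

    // Assumed behavior: return this buffer when it is large enough,
    // otherwise hand back a new, larger buffer and mark this one as replaced.
    FakeBuf reallocIfNeeded(int size) {
        if (size <= data.length) {
            return this;
        }
        replaced = true;
        return new FakeBuf(size);
    }

    void setBytes(byte[] value) {
        System.arraycopy(value, 0, data, 0, value.length);
    }
}

// Hypothetical writer that keeps its own reference to a buffer.
final class FakeWriter {
    private FakeBuf buf;

    FakeWriter(FakeBuf buf) {
        this.buf = buf;
    }

    FakeBuf buf() {
        return buf;
    }

    void write(byte[] value) {
        buf = buf.reallocIfNeeded(value.length); // only this writer sees the new buffer
        buf.setBytes(value);
    }
}

final class SharedBufferDemo {
    // Returns true when the second writer is left holding the replaced buffer.
    static boolean secondWriterHoldsReplacedBuffer() {
        FakeBuf shared = new FakeBuf(256);
        FakeWriter first = new FakeWriter(shared);
        FakeWriter second = new FakeWriter(shared);
        first.write(new byte[300]); // too large for 256 bytes, forces a reallocation
        return second.buf() != first.buf() && second.buf().isReplaced();
    }
}
```

In this toy model, the second writer never learns about the reallocation, which mirrors the stale reference described above.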
Description
HiveValueWriterFactory now creates a unique DrillBuf for each writer.
HiveWriters are actually used one at a time, so a single buffer could serve all the writers. To do this, I could create a holder class for DrillBuf, so that each writer references the same holder, which would store the new buffer returned by every drillBuf.reallocIfNeeded(int size) call. But that logic looked slightly confusing to me, so I decided to let each HiveWriter use its own buffer.
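For comparison, the holder alternative that was considered and not implemented might look roughly like the sketch below. SketchBuf, BufHolder, HolderWriter and HolderDemo are hypothetical names invented for this illustration, and SketchBuf is a toy stand-in for DrillBuf rather than the real class.

```java
// Hypothetical stand-in for DrillBuf; it is NOT the real Drill class.
final class SketchBuf {
    private final byte[] data;
    private boolean replaced; // true once a larger buffer has superseded this one

    SketchBuf(int capacity) {
        this.data = new byte[capacity];
    }

    boolean isReplaced() {
        return replaced;
    }

    // Assumed behavior: same instance if large enough, otherwise a new buffer.
    SketchBuf reallocIfNeeded(int size) {
        if (size <= data.length) {
            return this;
        }
        replaced = true;
        return new SketchBuf(size);
    }

    void setBytes(byte[] value) {
        System.arraycopy(value, 0, data, 0, value.length);
    }
}

// Hypothetical holder: the single place where the current buffer is stored.
final class BufHolder {
    private SketchBuf buf;

    BufHolder(SketchBuf buf) {
        this.buf = buf;
    }

    SketchBuf current() {
        return buf;
    }

    // Stores whatever reallocIfNeeded returns, so every writer sees the update.
    SketchBuf ensureCapacity(int size) {
        buf = buf.reallocIfNeeded(size);
        return buf;
    }
}

// Hypothetical writer that reaches the buffer only through the shared holder.
final class HolderWriter {
    private final BufHolder holder;

    HolderWriter(BufHolder holder) {
        this.holder = holder;
    }

    SketchBuf currentBuf() {
        return holder.current();
    }

    void write(byte[] value) {
        holder.ensureCapacity(value.length).setBytes(value);
    }
}

final class HolderDemo {
    // Returns true when both writers still see the same, live buffer.
    static boolean writersStayInSync() {
        BufHolder holder = new BufHolder(new SketchBuf(256));
        HolderWriter first = new HolderWriter(holder);
        HolderWriter second = new HolderWriter(holder);
        first.write(new byte[300]); // forces a reallocation inside the holder
        return second.currentBuf() == first.currentBuf()
            && !second.currentBuf().isReplaced();
    }
}
```

The extra indirection is what makes this variant harder to follow, which is the trade-off behind giving each HiveWriter its own buffer instead.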
Documentation
-
Testing
Added a new unit test that queries a Hive table with variable-length values of Binary, VarChar, Char and String types.
DRILL-8495: Tried to remove unmanaged buffer