Note: Datalight has published a technical bulletin (follow the link to download) which discusses the consequences of this bug, which projects are likely to be affected, and what has been done to correct it.
Reliance Edge is using a global variable for the "current" (i.e., currently accessed) volume in several places where doing so is inappropriate and problematic.
In core/driver/blockio.c, every function is using gpRedVolConf (the global variable for the per-volume configuration parameters for the current volume) to access the number of times that an I/O operation should be retried:
All of the RedIo functions have a bVolNum (volume number) parameter; and so, semantically speaking, they should use the parameters for the specified volume rather than the current volume. The issue is more than semantical with RedIoWrite(). The block buffers in Reliance Edge (core/driver/buffer.c) are shared by all mounted Reliance Edge volumes; and when doing LRU replacement of a dirty block buffer, that module will invoke RedIoWrite() (from BufferWrite()) on a buffer that is not necessarily from the current volume. As a result, the bBlockIoRetries and ullSectorOffset values used may be for the wrong volume. If partitioning is in use, using the wrong ullSectorOffset will generally result in a I/O error, since (as of v2.2.1) the RedOsBDevWrite() implementations use the VOLUME_SECTOR_RANGE_IS_VALID() to validate the sector range, which will detect the discrepancy.
In core/driver/buffer.c, the BufferFinalize() function is using gpRedVolume (the global variable for volume data for the current volume) to get at the current sequence number:
uint64_t ullSeqNum = gpRedVolume->ullSequence;
BufferFinalize() is called when doing LRU replacement of dirty block buffer; as mentioned above, the block being written is not necessarily from the current volume. If the block buffer being written is not for the current volume, that means the incorrect sequence number is being put into the metadata block header. This could result in metadata corruption. The sequence number from the current volume might be higher than the sequence number for the volume whose metadata block is being written. When that metadata block is read back in, Reliance Edge detects (in BufferIsValid()) that the sequence number in the metadata block header is too high for the volume: resulting in a critical error that sets the volume read-only.
BufferFinalize() also calls RedVolSeqNumIncrement(), which always increments the sequence number from the current volume: it should be incrementing the sequence number for the volume whose metadata block is being written.
Note: Datalight has published a technical bulletin (follow the link to download) which discusses the consequences of this bug, which projects are likely to be affected, and what has been done to correct it.
Reliance Edge is using a global variable for the "current" (i.e., currently accessed) volume in several places where doing so is inappropriate and problematic.
In core/driver/blockio.c, every function is using
gpRedVolConf
(the global variable for the per-volume configuration parameters for the current volume) to access the number of times that an I/O operation should be retried:Also in core/driver/blockio.c,
RedIoRead()
andRedIoWrite()
are also usinggpRedVolConf
for the sector offset:All of the
RedIo
functions have abVolNum
(volume number) parameter; and so, semantically speaking, they should use the parameters for the specified volume rather than the current volume. The issue is more than semantical withRedIoWrite()
. The block buffers in Reliance Edge (core/driver/buffer.c) are shared by all mounted Reliance Edge volumes; and when doing LRU replacement of a dirty block buffer, that module will invokeRedIoWrite()
(fromBufferWrite()
) on a buffer that is not necessarily from the current volume. As a result, thebBlockIoRetries
andullSectorOffset
values used may be for the wrong volume. If partitioning is in use, using the wrongullSectorOffset
will generally result in a I/O error, since (as of v2.2.1) theRedOsBDevWrite()
implementations use theVOLUME_SECTOR_RANGE_IS_VALID()
to validate the sector range, which will detect the discrepancy.In core/driver/buffer.c, the
BufferFinalize()
function is usinggpRedVolume
(the global variable for volume data for the current volume) to get at the current sequence number:BufferFinalize()
is called when doing LRU replacement of dirty block buffer; as mentioned above, the block being written is not necessarily from the current volume. If the block buffer being written is not for the current volume, that means the incorrect sequence number is being put into the metadata block header. This could result in metadata corruption. The sequence number from the current volume might be higher than the sequence number for the volume whose metadata block is being written. When that metadata block is read back in, Reliance Edge detects (inBufferIsValid()
) that the sequence number in the metadata block header is too high for the volume: resulting in a critical error that sets the volume read-only.BufferFinalize()
also callsRedVolSeqNumIncrement()
, which always increments the sequence number from the current volume: it should be incrementing the sequence number for the volume whose metadata block is being written.For further details, refer to the technical bulletin.