blevesearch / zapx

Zap file format compatible with a future version of Bleve
Apache License 2.0

MB-62427: seeking the pointer correctly while reading segmentBase data #252

Closed Thejas-bhat closed 2 months ago

Thejas-bhat commented 2 months ago

The v16 file format introduced the concept of sections. As part of this change, the fieldsIndex in the metadata was replaced by a sectionsIndex, which serves the same purpose as fieldsIndex: it points to the file offset where the actual data of a field's section is stored.

This PR addresses a crash that occurs when only one field's data is written out while creating the in-memory segmentBase. The process crashed while initializing the field-related information in the loadFieldsNew() API at https://github.com/blevesearch/zapx/blob/0c7027f136973ae50dd8c86026c557e0a9cfbcf3/segment.go#L323

The cause lies in how the data is written out: because only one field's metadata is written at the very end of the buffer (https://github.com/blevesearch/zapx/blob/0c7027f136973ae50dd8c86026c557e0a9cfbcf3/write.go#L89), it amounts to only 9 bytes. However, the read at https://github.com/blevesearch/zapx/blob/0c7027f136973ae50dd8c86026c557e0a9cfbcf3/segment.go#L323 assumes binary.MaxVarintLen64 = 10 bytes are available, which results in an out-of-bounds access.
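To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the zapx code: unguardedRead and the 9-byte buffer are hypothetical stand-ins, and the sketch assumes the read always slices binary.MaxVarintLen64 bytes from the current position. It shows that the slice expression itself panics when fewer than 10 bytes remain, even though the encoded value is only one byte long.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// unguardedRead is a hypothetical helper that mimics the failing pattern:
// it always slices binary.MaxVarintLen64 (10) bytes starting at pos,
// regardless of how many bytes remain in mem. It reports whether the
// slice expression panicked instead of letting the panic propagate.
func unguardedRead(mem []byte, pos int) (val uint64, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	val, _ = binary.Uvarint(mem[pos : pos+binary.MaxVarintLen64])
	return val, false
}

func main() {
	// Assumed layout for illustration: a 9-byte tail whose first byte
	// is a one-byte uvarint. make gives len == cap == 9, so slicing
	// up to index 10 is out of range.
	mem := make([]byte, 9)
	mem[0] = 0x01

	_, panicked := unguardedRead(mem, 0)
	fmt.Println("panicked:", panicked)
}
```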

The proposed solution is to guard this kind of access by taking the length of the backing buffer into account, as was done in v15: https://github.com/blevesearch/zapx/blob/a72794cd37c5c992f48571b9c6267759cbc5ac33/segment.go#L268