Weijun-H / Read-Some-Paper

This repo is a reading list on modern data management systems.

LISA: A Learned Index Structure for Spatial Data #6

Open Weijun-H opened 1 year ago

Weijun-H commented 1 year ago

Abstract: In spatial query processing, the popular index R-tree may incur large storage consumption and high IO cost. Inspired by the recent learned index [17] that replaces B-tree with machine learning models, we study an analogy problem for spatial data. We propose a novel Learned Index structure for Spatial dAta (LISA for short). Its core idea is to use machine learning models, through several steps, to generate searchable data layout in disk pages for an arbitrary spatial dataset. In particular, LISA consists of a mapping function that maps spatial keys (points) into 1-dimensional mapped values, a learned shard prediction function that partitions the mapped space into shards, and a series of local models that organize shards into pages. Based on LISA, a range query algorithm is designed, followed by a lattice regression model that enables us to convert a KNN query to range queries. Algorithms are also designed for LISA to handle data updates. Extensive experiments demonstrate that LISA clearly outperforms R-tree and other alternatives in terms of storage consumption and IO cost for queries. Moreover, LISA can handle data insertions and deletions efficiently.
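
To make the pipeline in the abstract concrete, here is a rough Python sketch of the first two stages (mapping function, then shard partitioning) and a range query on top of them. It is not the paper's algorithm, only a toy under stated assumptions: points lie in the unit square `[0, 1)^2`, the mapping is a fixed grid cell id plus the x-offset inside the cell, and equal-count boundaries looked up with `bisect` stand in for the learned shard prediction function. The local models that organize shards into pages, the KNN conversion, and updates are left out. All names (`GRID`, `NUM_SHARDS`, `mapped_value`, `build_index`, `range_query`) are hypothetical.

```python
import bisect
import random

GRID = 4        # cells per axis (hypothetical choice)
NUM_SHARDS = 8  # number of shards (hypothetical choice)


def mapped_value(x, y, grid=GRID):
    """Map a point in [0, 1)^2 to a 1-D value: cell id + x-offset inside the cell."""
    cx = min(int(x * grid), grid - 1)
    cy = min(int(y * grid), grid - 1)
    cell = cx * grid + cy
    return cell + (x * grid - cx)


def build_index(points, num_shards=NUM_SHARDS):
    """Sort points by mapped value and cut them into equal-count shards.

    The sorted boundary list is a stand-in for a learned, monotone
    shard prediction function (assumption, not the paper's model).
    """
    entries = sorted((mapped_value(x, y), (x, y)) for x, y in points)
    size = max(1, -(-len(entries) // num_shards))  # ceiling division
    shards = [entries[i:i + size] for i in range(0, len(entries), size)]
    boundaries = [shard[0][0] for shard in shards[1:]]  # first mapped value of each later shard
    return boundaries, shards


def range_query(index, x1, y1, x2, y2, grid=GRID):
    """Return all indexed points inside the rectangle [x1, x2] x [y1, y2].

    Assumes 0 <= x1 <= x2 < 1 and 0 <= y1 <= y2 < 1.
    """
    boundaries, shards = index
    shard_ids = set()
    # Decompose the rectangle into per-cell intervals of mapped values.
    for cx in range(int(x1 * grid), min(int(x2 * grid), grid - 1) + 1):
        for cy in range(int(y1 * grid), min(int(y2 * grid), grid - 1) + 1):
            cell = cx * grid + cy
            lo = cell + max(x1 * grid - cx, 0.0)
            hi = cell + min(x2 * grid - cx, 1.0)
            # Every shard that may hold a mapped value in [lo, hi].
            first = bisect.bisect_left(boundaries, lo)
            last = bisect.bisect_right(boundaries, hi)
            shard_ids.update(range(first, last + 1))
    # Scan only the candidate shards and filter by the exact rectangle.
    return [
        p
        for s in sorted(shard_ids)
        for _, p in shards[s]
        if x1 <= p[0] <= x2 and y1 <= p[1] <= y2
    ]


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(1000)]
    index = build_index(pts)
    print(len(range_query(index, 0.2, 0.3, 0.6, 0.7)))
```

The point of the sketch is the shape of the query path: a rectangle becomes a handful of 1-D intervals, each interval selects a few shards, and only those shards are scanned. In LISA itself the shard lookup is a learned model rather than a sorted boundary list.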