pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License
4.78k stars, 298 forks

Add optional callback to ChunkedInsert #314

Closed mynameisfiber closed 4 years ago

mynameisfiber commented 4 years ago

This commit adds an optional callback argument to the ChunkedInsert object. The callback is any callable; it is invoked with the pending queue just before each chunk is inserted. This is useful for clearing local caches that exist to work around the eventual consistency caused by the delayed nature of chunked inserts: rows that are queued but not yet written are invisible to lookups against the table.

For example,

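from dataset.chunked import ChunkedInsert

cache = set()  # keys that are queued but not yet written to the table
# The callback runs before each chunk is inserted, so the cache is reset then.
chunked_table = ChunkedInsert(table, callback=lambda queue: cache.clear())
while True:
    data = get_data_id()
    key = data['key']
    # Skip keys that are already queued (cache) or already stored (table).
    if key in cache or table.find_one(key=key):
        continue
    cache.add(key)
    chunked_table.insert(data)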
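
The same pattern can be tried without a database using the rough, self-contained sketch below. It is not the code from this commit: `FakeTable`, `SketchChunkedInsert`, and `load_unique` are hypothetical stand-ins written for illustration, and the sketch assumes the callback is called with the queue right before the chunk is written.

```python
class FakeTable:
    """Hypothetical in-memory stand-in for a table (not the dataset API)."""

    def __init__(self):
        self.rows = []

    def insert_many(self, rows):
        self.rows.extend(dict(row) for row in rows)

    def find_one(self, **filters):
        for row in self.rows:
            if all(row.get(k) == v for k, v in filters.items()):
                return row
        return None


class SketchChunkedInsert:
    """Illustrative chunked inserter; an approximation, not the library's code."""

    def __init__(self, table, chunksize=1000, callback=None):
        if callback is not None and not callable(callback):
            raise TypeError("callback must be callable")
        self.table = table
        self.chunksize = chunksize
        self.callback = callback
        self.queue = []

    def insert(self, item):
        self.queue.append(item)
        if len(self.queue) >= self.chunksize:
            self.flush()

    def flush(self):
        if not self.queue:
            return
        # Assumption: the callback runs before the queued rows are written.
        if self.callback is not None:
            self.callback(self.queue)
        self.table.insert_many(self.queue)
        self.queue = []


def load_unique(records, table, chunksize=2):
    """Insert each key at most once, using a cache for not-yet-flushed rows."""
    cache = set()
    chunked = SketchChunkedInsert(
        table, chunksize=chunksize, callback=lambda queue: cache.clear()
    )
    for data in records:
        key = data["key"]
        # Queued rows are only visible via the cache, flushed rows via the table.
        if key in cache or table.find_one(key=key):
            continue
        cache.add(key)
        chunked.insert(data)
    chunked.flush()  # write whatever is still queued
    return cache


table = FakeTable()
records = [{"key": k} for k in ["a", "b", "a", "c", "b", "c", "d"]]
leftover = load_unique(records, table)
stored_keys = [row["key"] for row in table.rows]
```

Because the cache is cleared on every flush, it only ever holds keys for rows that are still queued, so it stays bounded by the chunk size while duplicates are caught either by the cache or by `find_one`.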

pudo commented 4 years ago

I'm a bit worried this will lead to some pretty wonky control flows, but let's give it a shot :)