We're implementing caching of sharding functions in the scope of #82.
There is a difference in how we cache a sharding function that is defined by name and one that is defined with its full body. The latter case is the easy one: we can track changes to the function with the `on_replace` trigger, so we just store a callable object in the cache. The former (a stored name) is tricky. One can replace the implementation of the function (just `rawset(_G, 'my_sharding_func', <...>)`) and the caching code will not be aware of it. So we store only the name of the function.
As a result, some extra code runs on each call to get the callable. It traverses the list of name chunks, like `{'vshard', 'router', 'bucket_id_mpcrc32'}`, and returns `_G.vshard.router.bucket_id_mpcrc32`. In measurements this shows a 1.5x performance drop; see PR #85 for numbers.
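To make the per-call cost concrete, here is a minimal sketch of that traversal. The helper names `split_name` and `resolve` are hypothetical and are not taken from the actual code in PR #85. The `vshard` table is a stand-in so that the snippet runs without the real module.

```lua
-- Hypothetical helper: split a dotted name into chunks once, at cache time.
local function split_name(name)
    local chunks = {}
    for chunk in string.gmatch(name, '[^.]+') do
        table.insert(chunks, chunk)
    end
    return chunks
end

-- Hypothetical helper: walk _G along the chunks. This is the part that
-- has to run on every call, because the function may have been replaced.
local function resolve(chunks)
    local obj = _G
    for _, chunk in ipairs(chunks) do
        if type(obj) ~= 'table' then
            return nil
        end
        obj = obj[chunk]
    end
    return obj
end

-- Stand-in for the real vshard module, for illustration only.
rawset(_G, 'vshard', {
    router = {bucket_id_mpcrc32 = function(key) return 1 end},
})

local chunks = split_name('vshard.router.bucket_id_mpcrc32')
local func = resolve(chunks)
```

Because `resolve` reads `_G` each time, a later `rawset` or plain assignment is picked up on the next call, which is exactly what a cached callable would miss.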
I see three ways to proceed here:
- Look deeper at the difference: maybe we can optimize the traversal code and make the difference negligible.
- Implement an exceptions list: function names that are known to be safe for caching (because they never change). Add the vshard functions (`vshard.router.bucket_id_mpcrc32`, `vshard.router.bucket_id_strcrc32`, `vshard.router.bucket_id`) to it by default and let the user manage the list.
- Ignore the difference. It is tiny in absolute numbers: 550ns vs 380ns (according to the numbers in PR #85).
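The exceptions list option could look roughly like the sketch below. All names here (`safe_names`, `callable_cache`, `get_sharding_func`, `set_safe`) are hypothetical and only illustrate the idea: names on the list get their callable cached, and everything else is resolved through `_G` on each call.

```lua
-- Hypothetical default list: names assumed never to be redefined.
local safe_names = {
    ['vshard.router.bucket_id_mpcrc32'] = true,
    ['vshard.router.bucket_id_strcrc32'] = true,
    ['vshard.router.bucket_id'] = true,
}

-- Callables are cached for listed names only.
local callable_cache = {}

-- Walk _G along a dotted name; returns nil if any chunk is missing.
local function resolve(name)
    local obj = _G
    for chunk in string.gmatch(name, '[^.]+') do
        if type(obj) ~= 'table' then
            return nil
        end
        obj = obj[chunk]
    end
    return obj
end

local function get_sharding_func(name)
    if not safe_names[name] then
        -- Not on the list: pay for the traversal on every call.
        return resolve(name)
    end
    local func = callable_cache[name]
    if func == nil then
        func = resolve(name)
        callable_cache[name] = func
    end
    return func
end

-- Hypothetical knob that lets a user add or remove a name.
local function set_safe(name, is_safe)
    safe_names[name] = is_safe or nil
    callable_cache[name] = nil
end
```

The trade-off is that a listed function replaced at runtime keeps being served from the cache until the name is removed from the list, so the defaults should only include functions that really never change.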