Closed rewritten closed 2 years ago
@msiemens do you think this solves the mapping problem? Do you want me to further improve test coverage / documentation / anything?
Thanks @rewritten! This is an elegant implementation, I like it!
Maybe it would be even better to avoid an extra method altogether and use
__getitem__
:
I think it would be more consistent with how other parts of the Query
interface work to use the .map()
function that you've already implemented π
There's another question: How do we make sure that lambda
s passed to .map()
have a consistent hash? For example:
>>> hash(lambda: 1)
275762436
>>> hash(lambda: 1)
275850424
This would mean that a query that uses map
would create multiple entries in the query cache for the same lambda
. A solution would be to build the hash from the lambda
s code:
>>> hash((lambda: 1).__code__)
6824985394236043508
>>> hash((lambda: 1).__code__)
6824985394236043508
But it gets even more complicated: lambda
s can references local variables which are not included in the hash. Thus the query results would change but the query's hash would not and the query cache would return old results:
>>> x = 2
>>> hash((lambda: x).__code__)
581326866784499933
>>> x = 20
>>> hash((lambda: x).__code__)
581326866784499933
This is something I don't have a solution for except to somehow exclude queries that use map
from the query cacheβ¦
Interesting. I probably would not worry about not being able to reuse a cached lambda for a query with the "same" lambda. On the contrary, rightly because even the same callable might have different effects depending on the state of the universe, I would explicitly avoid caching any callable at all.
x = 1
fn = lambda y: y + x
q = Query().val.map(fn) == 4
# q will return items with val==3
x = 2
# now q will return other items
I'll look into the caching, but I'm quite sure that you don't want to cache the result of callable-based queries.
Hey @msiemens I have given a try to mark queries as uncacheable when they contain a .map(callable)
part, but it is not ideal, because:
This last commit contains the needed changes, but I find it a bit inelegant. If you like it, I'm not opposed to merging, but I feel that there could be a better solution.
Hey @msiemens have you had an opportunity to see if this code fits your need?
@rewritten I will have a closer look at your solution in the next two weeks, I'm just a little busy at the moment with the Christmas season π I'll check if I find a more elegant solution, but I also think that a pragmatic solution absolutely has its merits!
Thanks for your work! It's much appreciated π
Regarding checking if the query is cachable or not: I think this implementation is fine. After all, explicit is better than implicit and all that π
I'll try to get a new version of TinyDB ready and published this weekend which will include your changes!
This is now released in TinyDB v4.6.0 π₯³
Attempt to solve #423. The idea comes from lenses in general, and more specifically from Elixir's
Access
module and its kernel companion functionsget_in
and friends.Maybe it would be even better to avoid an extra method altogether and use
__getitem__
:WDYT?