Closed primoz-k closed 8 years ago
Right now, `frozendict` assumes that all the values of the original dictionary are hashable.
Let me give a little background on why we have `frozendict` at all.
When dedupe learns blocking rules, it keeps track of the pairs of records that each blocking rule covers. It does this by building sets of pairs of records. To be stored in a Python set, an object must be hashable, and so the records must be hashable.
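To make the constraint concrete, here is a minimal sketch of why plain dict records cannot go into a set while a frozen stand-in can. The `freeze` helper, the field names, and the "same city" rule are hypothetical and only for illustration; this is not dedupe's actual `frozendict`.

```python
from itertools import combinations

# Hypothetical records; field names are for illustration only.
records = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "Anne", "city": "Oslo"},
    {"name": "Bob", "city": "Rome"},
]


def freeze(record):
    """Return a hashable stand-in for a flat dict whose values are hashable."""
    return frozenset(record.items())


# A plain dict cannot be a set member.
try:
    {records[0]}
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# Frozen records can, so the pairs covered by a (toy) rule fit in a set.
frozen = [freeze(r) for r in records]
covered_pairs = {
    (a, b)
    for a, b in combinations(frozen, 2)
    if dict(a)["city"] == dict(b)["city"]
}
```

Note that `freeze` itself fails as soon as a value is a list or a dict, which is the situation this issue describes.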
It's possible to have a different design and keep track of hashable ids that refer to the records. I've tried this a few times, and it added a lot of complexity to the design.
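For comparison, the id-based alternative could look roughly like the sketch below. The `data` mapping and the `same_initial` rule are assumptions made up for this example, not code from dedupe.

```python
from itertools import combinations

# Hypothetical data: record ids mapped to records that hold an unhashable value.
data = {
    1: {"name": "Ann", "tags": ["a", "b"]},
    2: {"name": "Anne", "tags": ["a"]},
    3: {"name": "Bob", "tags": []},
}


def same_initial(rec_a, rec_b):
    """Toy stand-in for a blocking rule."""
    return rec_a["name"][0] == rec_b["name"][0]


# Only the ids go into the set; the records are looked up when needed.
covered_id_pairs = {
    (id_a, id_b)
    for id_a, id_b in combinations(sorted(data), 2)
    if same_initial(data[id_a], data[id_b])
}
```

The records never need to be hashable here, but every consumer of the pairs now has to carry the `data` mapping around to resolve ids, which is where the extra complexity comes from.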
Okay, so there are three ways forward
If you wanted to work on this, I would say the second option is probably the best.
Perfect. In the meantime I have already modified the hash method so that it now handles lists and dictionaries by casting them into tuples.
I am just glad I went in the right direction. If you want, I will submit a PR once I have made this method a bit more bulletproof.
I'd like to see a PR, for sure. Probably needs a recursive design.
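A recursive design might look like the following sketch. The `make_hashable` name is hypothetical, and the nested shape of `contactpositions` is assumed for illustration. Unlike the tuple-based approach described above, this version turns dicts into frozensets of key/value pairs, so key order does not affect the hash.

```python
def make_hashable(value):
    """Recursively convert dicts, lists, tuples and sets into hashable values."""
    if isinstance(value, dict):
        # Keys are already hashable; only the values need converting.
        return frozenset((k, make_hashable(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(make_hashable(v) for v in value)
    if isinstance(value, (set, frozenset)):
        return frozenset(make_hashable(v) for v in value)
    # Anything else is assumed to be hashable already (str, int, None, ...).
    return value


# Hypothetical record with a nested, JSON-like column.
record = {
    "name": "Acme",
    "contactpositions": [
        {"title": "CEO", "tags": ["a", "b"]},
        {"title": "CTO", "tags": []},
    ],
}

record_hash = hash(make_hashable(record))
```

One trade-off of this sketch is that it erases some type distinctions: a list and a tuple with the same items produce the same result, as do an empty dict and an empty set. That is harmless for hashing but matters if the converted value is also used for equality.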
I finally removed the need for the records to be hashable in c4c67bba25c3f53d0668cf13016a32df38c0c10c.
I'm retrieving my rows from a PostgreSQL DB, and one of the retrieved columns is a `jsonb` array. When labeling is completed, I get `TypeError: unhashable type: 'list'` in the `__hash__()` method of a `frozendict`.
Example of `self._d`:
I've tried converting the list to a tuple, but then this can happen:
The problem, as you can see, is with `contactpositions`. Are these types not supported, and is there any way I can still use them?