glottolog / pyglottolog

Python API to access glottolog/glottolog
https://glottolog.org
Apache License 2.0

__hash__ for Languoids #68

Closed: antipodite closed this issue 2 years ago

antipodite commented 2 years ago

I noticed that the __hash__ magic method for Languoids is defined like so: `def __hash__(self): return id(self)`. This means that some of the useful things you can do with sets, like intersection and union, won't work as expected with sets of Languoids, because the hash function returns the memory address of the particular Languoid instance. Is this the desired behaviour, or could we change it to something like hashing a tuple of languoid attributes, so that set operations work with Languoids?
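
A minimal sketch of the behaviour described above. `IdentityHashed` is a hypothetical stand-in, not the actual pyglottolog `Languoid` class; it defines no `__eq__`, so Python's default identity comparison applies, and `"abcd1234"` is a made-up Glottocode-shaped string.

```python
class IdentityHashed:
    """Hypothetical stand-in for Languoid; not the pyglottolog class."""

    def __init__(self, glottocode):
        self.id = glottocode

    def __hash__(self):
        # The definition quoted in the issue.
        return id(self)


a = IdentityHashed("abcd1234")
b = IdentityHashed("abcd1234")  # same code, but a separate instance

same_object = {a} & {a}       # contains a
separate_objects = {a} & {b}  # empty: a and b are different objects
```

Set operations still run, but they only match the very same instance, so two objects built for the same Glottocode never intersect.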

xrotwang commented 2 years ago

Hm. I think I did that on purpose - but I can't come up with a good reason now. I'll check what changing it to - say - the Glottocode (?) will do to the tests.

xrotwang commented 2 years ago

@antipodite just to make sure: You would want to have hash(self.id) - i.e. the Glottocode - as hash for Languoid?
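
As a rough illustration of that proposal, here is a sketch with a hypothetical stand-in class (`GlottocodeHashed`, not the real `Languoid`). The `__eq__` shown is an assumption made for the example, as are the made-up codes.

```python
class GlottocodeHashed:
    """Hypothetical stand-in for Languoid; not the pyglottolog class."""

    def __init__(self, glottocode):
        self.id = glottocode

    def __eq__(self, other):
        # Assumption: two objects are equal when their Glottocodes match.
        if not isinstance(other, GlottocodeHashed):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        # Hash the Glottocode, as suggested in the comment above.
        return hash(self.id)


a = GlottocodeHashed("abcd1234")
b = GlottocodeHashed("abcd1234")  # separate instance, same code
c = GlottocodeHashed("wxyz5678")

common = {a, c} & {b}  # one element: a and b count as the same member
merged = {a, c} | {b}  # two elements: b is not added a second time
```

With equality and hash both based on the Glottocode, separately created instances behave as one set member.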

antipodite commented 2 years ago

Yeah, something like that, although I thought it would be better to hash a pair of immutable values, so that the hash of the object isn’t identical to the hash of the Glottocode as a string. But maybe it doesn’t matter.
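
One way the pair idea could look, as a sketch only: `PairHashed` is a hypothetical stand-in, and pairing the class name with the Glottocode is just one possible choice of two immutable values.

```python
class PairHashed:
    """Hypothetical stand-in for Languoid; not the pyglottolog class."""

    def __init__(self, glottocode):
        self.id = glottocode

    def __eq__(self, other):
        # Assumption: equality is decided by the Glottocode alone.
        if not isinstance(other, PairHashed):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        # Hash a pair of immutable values instead of the bare string.
        return hash((type(self).__name__, self.id))


a = PairHashed("abcd1234")
b = PairHashed("abcd1234")

shared = {a} & {b}                    # one element
string_is_member = "abcd1234" in {a}  # False
```

On the "maybe it doesn't matter" point: a set checks equality after matching hashes, so even with `hash(self.id)` a plain string would not be mistaken for the object as long as `__eq__` rejects strings. Sharing a hash value affects only lookup cost, not correctness.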

xrotwang commented 2 years ago

@antipodite Maybe you could just work with a modified fork and figure out which hash works best for you - and upon next Glottolog data release we can try out whether that interferes in some way with data curation? We do a lot of comparing when putting the data for the web site together, so I wouldn't be 100% comfortable with such a change just because the pyglottolog package checks pass.

antipodite commented 2 years ago

yeah sounds good