Closed Anaphory closed 6 years ago
I had an idea some time ago of making a bit map of all features (one bit per possible bipa feature, set to true or false accordingly). It would be a kind of locality-sensitive hashing, allowing sounds to be compared up to a point.
Just an idea, but it wouldn't take long to implement, and you could guarantee perfect hashing (each hash mapping to only one sound).
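The bit-map idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the feature names in `ALL_FEATURES` are a small hypothetical inventory rather than the real bipa feature set, and `feature_bitmap` and `hamming` are made-up helper names, not part of pyclts.

```python
# Hypothetical feature inventory (an assumption for illustration);
# the real bipa feature set is larger and defined by the library.
ALL_FEATURES = (
    "voiced", "voiceless", "bilabial", "alveolar",
    "stop", "fricative", "consonant",
)

# Assign every feature a fixed bit position.
BIT = {name: 1 << i for i, name in enumerate(ALL_FEATURES)}


def feature_bitmap(features):
    """Return an int with one bit set for each feature present."""
    value = 0
    for name in features:
        value |= BIT[name]  # raises KeyError for an unknown feature
    return value


def hamming(a, b):
    """Count the bit positions (features) in which two bitmaps differ."""
    return bin(a ^ b).count("1")


p = feature_bitmap({"voiceless", "bilabial", "stop", "consonant"})
b = feature_bitmap({"voiced", "bilabial", "stop", "consonant"})
s = feature_bitmap({"voiceless", "alveolar", "fricative", "consonant"})

# Sounds that share more features end up closer together.
print(hamming(p, b), hamming(p, s))
```

Because every feature owns exactly one bit, two different feature sets can never produce the same integer, which is the "perfect hashing" property; the Hamming distance gives the rough comparison between sounds.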
On 21 Feb 2018 12:34 PM, "Gereon Kaiping" notifications@github.com wrote:
It would be very nice if Sound objects were hashable. I'm not entirely sure what should be hashed, but creating sets of Sounds (e.g. for phoneme inventories) and having dicts with sound sequence (tuple of Sound) keys already looks useful to me, just from building some toy functionality on the basis of pyclts.
There is probably a better implementation than
    def __hash__(self):
        return hash(self.name)
but I haven't delved into the intestines of the objects, or into what TranscriptionSystems and other classes might need here, to say what that might be.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/cldf/clts/issues/107, or mute the thread https://github.com/notifications/unsubscribe-auth/AAar92dpHhCMFL0CMFHWxAeXE4MXOBQxks5tXDeIgaJpZM4SN1sB .
Sounds somewhat like using the feature vector as hash.
Tiago Tresoldi notifications@github.com wrote on Wed, 21 Feb 2018, 16:40:
I had an idea some time ago of making a bit map of all features (each possible bipa feature a bit, set to true or false accordingly). It would be a kind of locality sensitive hashing, allowing to compare sounds up to a point.
Just an idea, but wouldn't take long to implement, and you could guarantee a perfect hashing (one hash mapping to only one sound).
The feature vector is a frozenset, so it is hashable already, isn't it?
If you frozenset your features, you can access them via bipa.features. Or do you mean something different?
Yes, but I was thinking of representing it as a normal array of bytes, like a "normal" hash (i.e., hexadecimal representation and so on). Of course, deep down it is just a number. But again, just an idea I had some time ago.
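A minimal sketch of the byte-array and hexadecimal representation mentioned above, assuming the feature bitmap is already available as a plain integer. The ten feature names and the helpers `bitmap`, `to_hex`, and `from_hex` are hypothetical and are not taken from pyclts.

```python
# Hypothetical feature inventory (an assumption for illustration).
ALL_FEATURES = (
    "voiced", "voiceless", "bilabial", "alveolar", "velar",
    "stop", "fricative", "nasal", "consonant", "vowel",
)

# Number of whole bytes needed to hold one bit per feature.
N_BYTES = (len(ALL_FEATURES) + 7) // 8


def bitmap(features):
    """Encode a feature set as an int, one bit per feature."""
    # ALL_FEATURES.index raises ValueError for an unknown feature.
    return sum(1 << ALL_FEATURES.index(name) for name in set(features))


def to_hex(value):
    """Render the bitmap as a fixed-width hexadecimal string."""
    return value.to_bytes(N_BYTES, "big").hex()


def from_hex(text):
    """Recover the integer bitmap from its hexadecimal form."""
    return int.from_bytes(bytes.fromhex(text), "big")


value = bitmap({"voiced", "velar", "stop", "consonant"})
digest = to_hex(value)
print(digest)
```

The fixed width means every sound gets a string of the same length, so the result looks like a conventional hash digest while remaining fully reversible.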
On 21 Feb 2018 12:52 PM, "Johann-Mattis List" <notifications@github.com> wrote:
if you frozenset your features, you can access them via bipa.features. Or do you mean something differently?
Is this still considered important by anybody? If not, I'll just close it for the time being...
It would be very nice if Sound objects were hashable. I'm not entirely sure what should be hashed, but creating sets of Sounds (e.g. for phoneme inventories) and having dicts with sound sequence (tuple of Sound) keys already looks useful to me, just from building some toy functionality on the basis of pyclts.
There is probably a better implementation than
    def __hash__(self):
        return hash(self.name)
but I haven't delved into the intestines of the objects, or into what TranscriptionSystems and other classes might need here, to say what that might be.