cldf-clts / soundvectors

MIT License
1 stars 0 forks source link

Mattis check #11

Closed LinguList closed 4 months ago

LinguList commented 4 months ago

@arubehn and @justalingwist, I have now a new proposal and I think this makes the code much clearer. Integration of both linse and clts is now also possible, a clear advantage, but it is not a dependency of the code, it also works without it.

from linse.annotate import clts
from clts2vec import CLTS2Vec
c2v = CLTS2Vec(ts=clts)
c2v.get_vec("t") # returns binary vector
c2v.get_vec("t", vectorize=False) # returns Vector object 
str(c2v.get_vec("t", vectorize=False)) # returns string represntation as previously in utils
c2v.get_vec("t", vectorize=False).as_set() # returns feature set as previously in utils

Tests must still be integrated now.

arubehn commented 4 months ago

Thanks for the comments; see my individual replies above. One more question: If no transcription system is passed explicitly via ts, then no system is initiated at all. How do you feel about generating a CLTS object by default, if ts is not specified? I feel like the most frequent use case of this package would be to just pass sounds as strings, and have CLTS handle them under the hood. Requiring the user to explicitly pass an instance of CLTS (or any other transcription system) might possibly lead to some confusion and makes the package less straightforward to apply.

LinguList commented 4 months ago

I solved the problem now. The validatio accounts for None, but it requires less code.

LinguList commented 4 months ago

I would merge now. Later we can apply more fixes.

arubehn commented 4 months ago

Alright, looks good - thank you!