Closed by LinguList 6 years ago
I imagine a call structure in the form of:
from pyclts.metadata import MetaData
from pyclts.clts import CLTS, translate
bipa = CLTS('bipa')
dolgo = MetaData('dolgopolsky')
translate('t o x t a', bipa, dolgo)
So the metadata, or at least parts of it, could have a similar call/get structure: `str(dolgo.get(sound))` would return the Dolgopolsky sound class. If we do this for Phoible, it would return the Phoible character. The difference between metadata and transcription systems would then be that metadata is a fixed set of characters, while transcription systems can generate new characters by combining diacritics with base sounds. Of course, metadata can be more, e.g., have an ID or a URL, but in most cases we'd still assume that people assign a certain GRAPHEME to a given meta-datapoint that is related to sounds, so `str(metadata.get(sound))` would behave basically the same way in `MetaData` and `CLTS`.
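To make the idea concrete, here is a minimal sketch of the shared `get` interface, using toy stand-ins rather than the actual pyclts classes. The names `ToyTranscription`, `ToyMetaData`, and `Grapheme`, the tiny lookup tables, the class assignments in them, and the `"?"` fallback for unknown sounds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grapheme:
    """Hypothetical wrapper: a grapheme plus optional extra metadata."""
    grapheme: str
    url: str = ""

    def __str__(self):
        return self.grapheme


class ToyTranscription:
    """Toy transcription system: derives sounds from bases plus diacritics."""

    def __init__(self, bases, diacritics):
        self.bases = bases            # base symbol -> sound name
        self.diacritics = diacritics  # diacritic -> feature name

    def get(self, symbol):
        # Generative lookup: first character is the base, the rest are diacritics.
        base, marks = symbol[0], symbol[1:]
        if base not in self.bases or any(m not in self.diacritics for m in marks):
            raise KeyError(symbol)
        return " ".join([self.diacritics[m] for m in marks] + [self.bases[base]])


class ToyMetaData:
    """Toy metadata set: a fixed table, nothing is generated."""

    def __init__(self, table):
        self.table = table  # sound name -> Grapheme

    def get(self, sound):
        # Assumption: unknown sounds map to a "?" placeholder.
        return self.table.get(sound, Grapheme("?"))


def translate(sequence, source, target):
    """Map each space-separated symbol through source, then target."""
    return " ".join(str(target.get(source.get(s))) for s in sequence.split())


bipa = ToyTranscription(
    bases={
        "t": "voiceless alveolar stop",
        "x": "voiceless velar fricative",
        "o": "close-mid back rounded vowel",
        "a": "open front unrounded vowel",
    },
    diacritics={"ʰ": "aspirated"},
)
dolgo = ToyMetaData({
    "voiceless alveolar stop": Grapheme("T"),
    "voiceless velar fricative": Grapheme("K"),
    "close-mid back rounded vowel": Grapheme("V"),
    "open front unrounded vowel": Grapheme("V"),
})

print(translate("t o x t a", bipa, dolgo))
```

The toy transcription system accepts `tʰ` even though it is not listed anywhere, while the toy metadata table only knows the four sounds it was given, which is the fixed-versus-generative difference described above.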
Given these two basic data types, transcription systems and databases, one could even think of changing the names in the code:
- `pyclts.clts.CLTS` -> `pyclts.transcription.Transcription` (or `TranscriptionSystem`, although this name is long)
- `pyclts.metadata` -> `pyclts.data.Dataset`
We have no real problem getting the normal sounds, but metadata should probably also be conveniently retrievable, perhaps by loading it on request.
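Loading on request could look roughly like the sketch below. It is not taken from pyclts: `LazyDataset`, `fake_loader`, and the one-entry table are hypothetical, and `functools.cached_property` (Python 3.8+) is just one of several ways to defer the load.

```python
from functools import cached_property


class LazyDataset:
    """Hypothetical dataset whose table is read only on first access."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader  # callable returning {sound name: grapheme}

    @cached_property
    def table(self):
        # Runs once, on first access; the result is cached on the instance.
        return self._loader(self.name)

    def get(self, sound, default="?"):
        return self.table.get(sound, default)


calls = []  # records how often the loader actually runs


def fake_loader(name):
    # Stand-in for reading a metadata file from disk.
    calls.append(name)
    return {"voiceless alveolar stop": "T"}


dolgo = LazyDataset("dolgopolsky", fake_loader)
```

Creating `dolgo` does not call the loader; the first `dolgo.get(...)` does, and later lookups reuse the cached table.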