Closed SimonGreenhill closed 4 years ago
I think that's too simplistic. There may be more relations between LanguageTable and CognateTable. One could say that this is a shortcut for the "standard" relation via FormTable->languageReference and CognateTable->formReference - but even then this breaks down for the case of borrowings.
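To make the "standard" relation concrete, here is a minimal sketch in plain Python of the two-hop lookup (CognateTable->formReference, then FormTable->languageReference). The rows are invented, and the column names `Language_ID`, `Form_ID` and `Cognateset_ID` assume the CLDF default column names; a real dataset may name them differently in its metadata.

```python
from collections import defaultdict

# Hypothetical rows, shaped like csv.DictReader output for forms.csv and cognates.csv.
forms = [
    {"ID": "f1", "Language_ID": "lang1", "Form": "mata"},
    {"ID": "f2", "Language_ID": "lang2", "Form": "maka"},
    {"ID": "f3", "Language_ID": "lang2", "Form": "ika"},
]
cognates = [
    {"ID": "c1", "Form_ID": "f1", "Cognateset_ID": "eye-1"},
    {"ID": "c2", "Form_ID": "f2", "Cognateset_ID": "eye-1"},
    {"ID": "c3", "Form_ID": "f3", "Cognateset_ID": "fish-1"},
]


def cognates_by_language(forms, cognates):
    """Follow CognateTable->formReference, then FormTable->languageReference."""
    # First hop target: which language does each form belong to?
    language_of_form = {f["ID"]: f["Language_ID"] for f in forms}
    result = defaultdict(list)
    for cog in cognates:
        # Second hop: attach the cognate set to the language of its form.
        result[language_of_form[cog["Form_ID"]]].append(cog["Cognateset_ID"])
    return dict(result)


by_language = cognates_by_language(forms, cognates)
```

This covers only the single path through FormTable; it says nothing about any other relation between the two tables, which is exactly where a built-in shortcut would get ambiguous.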
I think the only syntax I could sort-of get behind for this kind of functionality would be the SQL that could also be used against the SQLite db created from a dataset - and I suspect that in many cases creating the sqlite db and then running the query on the fly will not be much slower. Let me experiment a bit.
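As a rough illustration of what "SQL against the SQLite db" could look like, the sketch below builds a toy in-memory database with the standard library's `sqlite3` module and runs the join explicitly. The table and column names (`cldf_id`, `cldf_languageReference`, `cldf_formReference`, ...) and all rows are assumptions for illustration; check the schema of the database actually created from your dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy schema; names are assumed, not taken from a real dataset.
conn.executescript("""
CREATE TABLE LanguageTable (cldf_id TEXT PRIMARY KEY, cldf_name TEXT);
CREATE TABLE FormTable (
    cldf_id TEXT PRIMARY KEY,
    cldf_languageReference TEXT REFERENCES LanguageTable(cldf_id),
    cldf_form TEXT
);
CREATE TABLE CognateTable (
    cldf_id TEXT PRIMARY KEY,
    cldf_formReference TEXT REFERENCES FormTable(cldf_id),
    cldf_cognatesetReference TEXT
);
""")
conn.executemany(
    "INSERT INTO LanguageTable VALUES (?, ?)",
    [("lang1", "Language One"), ("lang2", "Language Two")],
)
conn.executemany(
    "INSERT INTO FormTable VALUES (?, ?, ?)",
    [("f1", "lang1", "mata"), ("f2", "lang2", "maka")],
)
conn.executemany(
    "INSERT INTO CognateTable VALUES (?, ?, ?)",
    [("c1", "f1", "eye-1"), ("c2", "f2", "eye-1")],
)

# The join path is spelled out in the query, so nothing is guessed implicitly.
rows = conn.execute("""
SELECT l.cldf_name, f.cldf_form, c.cldf_cognatesetReference
FROM CognateTable AS c
JOIN FormTable AS f ON c.cldf_formReference = f.cldf_id
JOIN LanguageTable AS l ON f.cldf_languageReference = l.cldf_id
ORDER BY c.cldf_id
""").fetchall()
conn.close()
```

Because the query names every join condition, cases like borrowings can be handled by writing a different query rather than by extending a join API.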
Yeah, I can see that it's a can of worms. I meant to experiment and compare how fast it was to load the CLDF by iterating over the CSV versus loading it into an in-memory SQLite db, but haven't had a chance, so I'm keen to see what you come up with.
It can't be any faster, I guess, since currently loading implies iterating over all rows in all tables. But if the db is in-memory, it might be only slightly slower.
Hm. First small test: Creating database for lexibank/abvd in memory and on disk are both about 35 secs. So writing to disk doesn't seem to add much (for a 75MB database).
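For anyone wanting to try a similar comparison, here is a small sketch using only the standard library. It is not the test that was run on lexibank/abvd: the table, the row count and the `load` helper are made up, and the timings will depend on the machine and the data.

```python
import os
import sqlite3
import tempfile
import time


def load(path, n_rows=50_000):
    """Create one table at `path`, insert n_rows rows, return (seconds, row count)."""
    start = time.perf_counter()
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE FormTable (cldf_id TEXT PRIMARY KEY, cldf_form TEXT)")
    conn.executemany(
        "INSERT INTO FormTable VALUES (?, ?)",
        ((f"f{i}", f"form{i}") for i in range(n_rows)),
    )
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM FormTable").fetchone()[0]
    conn.close()
    return time.perf_counter() - start, count


with tempfile.TemporaryDirectory() as tmp:
    mem_secs, mem_count = load(":memory:")
    disk_secs, disk_count = load(os.path.join(tmp, "test.sqlite"))

print(f"in memory: {mem_secs:.2f}s, on disk: {disk_secs:.2f}s")
```

All inserts happen in a single transaction, which is usually what keeps the on-disk case close to the in-memory one.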
wow, and this should be one of the big datasets where any differences are noticeable. SQLite is amazing.
So it makes no sense to implement this in Python; if you want to do this, convert to SQLite instead. wontfix!
Oh -- do we have some documentation on how best to do this?
Hm. Not really. This is about it https://github.com/cldf/pycldf/blob/dcd5acd68a3ace91e9d8cb121b2eaf7132c1f78c/src/pycldf/db.py#L2-L36 Will add something to the pycldf README.
thanks!
It would be really helpful to have a simple join ability. Right now to load cognates for a Form, I'm doing something like ... whereas something like:
This of course makes data loading more complicated and risks trying to reinvent a database engine (and what types of joins should be supported). But we do know the relationships from metadata.json and I'm worried I'll manually join things incorrectly.