Closed fractaldragonflies closed 2 years ago
Okay. You can easily solve most issues.
Just replace the profiles with the ones you want and keep their current names.
OK, cmd_makecldf overwrites forms and leaves empty fields for segments, graphemes, and profile for the IDS languages. WOLD languages have segments, but neither graphemes nor profile.
But I can take our model from KeyPano to use in upgrading the Spanish or Portuguese profiles. So I'll work it that way.
If languages are missing, check their names in etc/languages.tsv and adjust them. I took the language names from a file you sent me, so some are probably wrong or simply missing.
For donor values, can you check my code and the WOLD data to see where I can find the value?
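A minimal sketch of that name check, in case it helps. It assumes etc/languages.tsv is tab-separated with a `Name` column (an assumption, the real header may differ), and the sample rows and the helper `missing_names` are invented for illustration only.

```python
import csv
import io

# Invented stand-in for etc/languages.tsv; the "Name" column is an assumption.
LANGUAGES_TSV = "ID\tName\nmapudungun\tMapudungun\nwichi\tWichí\n"

def missing_names(handle, wanted):
    """Return the wanted language names that do not appear in the Name column."""
    known = {row["Name"] for row in csv.DictReader(handle, delimiter="\t")}
    return [name for name in wanted if name not in known]

# Names as they might be spelled in the source data (hypothetical).
wanted = ["Mapudungun", "Wichi"]
missing = missing_names(io.StringIO(LANGUAGES_TSV), wanted)
print(missing)
```

With a real checkout, the `io.StringIO` object would be replaced by an opened etc/languages.tsv file; an exact-match test like this also catches spelling or accent differences.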
OK, I belatedly read Mattis' suggestion below to look into the Borrowing.csv. There we find source_word (donor_word) and source_languoid (donor_language). This should do it. Thanks.
There was a change in the WOLD DB structure, which confused me.
Borrowed is easy to modify in the code now.
I've coded True for Borrowed_score > 0.9, else False. At some point we might want to experiment with > 0.6 instead, although probabilities of 0.0 and 1.0 are overwhelmingly the most common.
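For reference, the thresholding could look roughly like this. The function name `is_borrowed` and the sample scores are hypothetical; only the 0.9 and 0.6 cut-offs come from the comment above.

```python
def is_borrowed(score, threshold=0.9):
    """Return True when the borrowed score is strictly above the threshold."""
    return float(score) > threshold

# Hypothetical scores, chosen to show where the two cut-offs disagree.
scores = [0.0, 0.5, 0.75, 1.0]
strict = [is_borrowed(s) for s in scores]
lenient = [is_borrowed(s, threshold=0.6) for s in scores]
print(strict, lenient)
```

Keeping the threshold as a parameter makes the > 0.6 experiment a one-argument change.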
There is a surprising number of cases where Borrowed_score = 0.0, but a Donor_language and Donor_value are given. These would seem to be errors, but I don't know what the correction would be.
You can make changes and post them as a PR.
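A rough way to list those suspicious rows, as a sketch only. The column names follow the fields mentioned above, the sample rows are invented, and the helper `find_inconsistent` is hypothetical; it flags a row when the score is 0.0 and either donor field is filled in.

```python
import csv
import io

# Invented sample rows; the real forms table has more columns.
SAMPLE = """ID,Borrowed_score,Donor_language,Donor_value
f1,1.0,Spanish,caballo
f2,0.0,Spanish,mesa
f3,0.0,,
"""

def find_inconsistent(rows):
    """Return rows with a zero borrowed score that still name a donor."""
    return [
        row for row in rows
        if float(row["Borrowed_score"]) == 0.0
        and (row["Donor_language"] or row["Donor_value"])
    ]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
suspect = find_inconsistent(rows)
print([row["ID"] for row in suspect])
```

This only reports the rows; deciding how to correct them is still open.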
@fractaldragonflies, I can also do this later, but if you are impatient (as I would be), please have a look at the cldf data and my code to get the language data, as this will maybe clarify some points of failure (e.g., why some languages don't get included now).
I'll give it a go... as you said, most fixes should be straightforward. And I need to learn this! I'm still not clear where the donor_value comes from, but that should be discoverable from the cldf of our earlier work in Pybor if not obvious in Sabor. Thanks!
It is in the wold/cldf/borrowings.csv table, which I load right in the beginning of the lexibank script! There, you see that the donor value is there, and can be loaded, but is called differently!
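As a sketch of that lookup: the snippet below builds a mapping from form ID to donor information out of a borrowings table. The column names `Target_Form_ID`, `source_word`, and `source_languoid` are assumptions about the file layout, the sample rows are invented, and `load_donors` is a hypothetical helper, not the function in the lexibank script.

```python
import csv
import io

# Invented stand-in for wold/cldf/borrowings.csv; column names are assumed.
SAMPLE = """ID,Target_Form_ID,source_word,source_languoid
1,form-10,caballo,Spanish
2,form-11,mesa,Spanish
"""

def load_donors(handle):
    """Map each borrowed form ID to its donor value and donor language."""
    donors = {}
    for row in csv.DictReader(handle):
        donors[row["Target_Form_ID"]] = {
            "Donor_value": row["source_word"],
            "Donor_language": row["source_languoid"],
        }
    return donors

donors = load_donors(io.StringIO(SAMPLE))
# Forms without an entry simply have no donor information.
print(donors.get("form-10"), donors.get("form-99"))
```

The renaming from the source column names to Donor_value and Donor_language happens once, at load time, so later code only sees the names used in the forms table.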
OK, sorry I didn't take this into account above.
Issues were resolved by the improvestore branch and pull request.
I did a clean install of Sabor and ran the activation steps. Files were replaced with updated versions. I found some essential and optional issues.
The Mapudungun and Wichi languages are missing from both the languages and forms relations. Essential.
donor_value is missing (it is not used in automated borrowing identification, but in the report for user diagnosis of problems and verification of correct decisions). Essential. Report snippet attached.
European Spanish and Portuguese forms are used -- instead of SpanishLA and PortugueseBR forms. Optional.
Borrowed is given as a full-text field, as in WOLD. It could be dropped or replaced with a boolean or binary indicator, since we also have Borrowed_score. Optional.
Segmented form for donor languages has been added. OK.
Graphemes and profile data have been added. OK.
I haven't actually tried to access it with any code yet. I am assuming that some light adaptation will be necessary, but hopefully nothing dramatic!
report-example.txt