lexibank / northperulex

Creative Commons Attribution 4.0 International

Missing languages #26

Closed MuffinLinwist closed 3 months ago

MuffinLinwist commented 5 months ago

@FredericBlum, we were missing more than two languages in the CLDF conversion. There was a name mismatch for Wampis (Huambisa). Chayahuita's data was missing because it hadn't been pulled from the Lexibank data; now we have it. In addition, the Huitotoan languages were not listed in etc/languages.tsv.

Now we are only missing the ortho-profiles for Wampis, Ocaina, WitotoMurui, WitotoNipode, and WitotoMinica. I'll handle them.

There is, however, an ID mismatch problem, since the data being pulled from Lexibank has these language names for three Huitotoan languages: Witoto Murui, Witoto Ni̵pode, Witoto Mi̵ni̵ca.

FredericBlum commented 5 months ago

> There is, however, an ID mismatch problem, since the data being pulled from Lexibank has these language names for three Huitotoan languages: Witoto Murui, Witoto Ni̵pode, Witoto Mi̵ni̵ca.

I don't understand what mismatch you are referring to. They all have different names and glottocodes?

MuffinLinwist commented 5 months ago

> There is, however, an ID mismatch problem, since the data being pulled from Lexibank has these language names for three Huitotoan languages: Witoto Murui, Witoto Ni̵pode, Witoto Mi̵ni̵ca.
>
> I don't understand what mismatch you are referring to. They all have different names and glottocodes?

This is the output I get:

ValueError: invalid CLDF identifier LanguageTable-ID: WitotoMi̵ni̵ca
ValueError: invalid CLDF identifier LanguageTable-ID: WitotoNi̵pode

The data extracted from Lexibank (and later merged into raw.tsv) has the LanguageID for these languages with the barred vowel. My question is where to change this.
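A minimal sketch of why these two IDs are rejected, assuming the usual CLDF identifier pattern (ASCII letters, digits, underscore, hyphen) and assuming the barred vowel is stored as "i" followed by the combining stroke U+0335. The helper name offending_chars is hypothetical and not part of any library.

```python
import re
import unicodedata

# Assumption: CLDF identifiers may only contain these characters.
CLDF_ID_CHAR = re.compile(r"[a-zA-Z0-9_\-]")


def offending_chars(identifier):
    """Return (character, Unicode name) for each character not allowed in an ID."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in identifier
        if not CLDF_ID_CHAR.fullmatch(ch)
    ]


# "\u0335" is written as an escape so the combining mark stays visible in the code.
for language_id in ["WitotoMurui", "WitotoNi\u0335pode"]:
    print(language_id, offending_chars(language_id))
```

An ID that yields an empty list here should pass the identifier check; any listed character has to be replaced before the conversion.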

MuffinLinwist commented 5 months ago

Now only the Witoto ortho-profiles are missing.

MuffinLinwist commented 4 months ago

The Huitotoan ortho-profiles are now in the dataset. I did not find any information on the diphthongs of Nipode and Minica; perhaps they are the same as the ones in Murui?

MuffinLinwist commented 4 months ago

I pulled Resígaro's data from Lexibank, so the next step is to work on the orthographic profile.

MuffinLinwist commented 4 months ago

About the open questions and comments in this PR (and the current status of the dataset):

  1. There is no solution yet for the few missing concepts in the Boran family.
  2. The invalid names in the dataset are already fixed. It was an easy fix with replacement pairs in raw/merge.py.
  3. No orthographic profile is missing for the data currently in the dataset.
  4. I also added data from Resígaro.

So if everything looks fine, and after I fix the missing concepts for Boran, we can merge this PR, and I can create new ones with the Ticuna and Yanesha data once I gather them.
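The replacement approach mentioned in point 2 could look roughly like the sketch below. The actual contents of raw/merge.py are not shown in this thread, so the names REPLACEMENTS and clean_language_id are hypothetical; only the target IDs WitotoNipode and WitotoMinica come from the discussion above.

```python
# Hypothetical replacement pairs: invalid ID (with combining stroke U+0335) -> valid ID.
REPLACEMENTS = {
    "WitotoNi\u0335pode": "WitotoNipode",
    "WitotoMi\u0335ni\u0335ca": "WitotoMinica",
}


def clean_language_id(language_id):
    """Return the replacement for a known invalid ID, or the ID unchanged."""
    return REPLACEMENTS.get(language_id, language_id)


rows = [
    {"LanguageID": "WitotoMurui"},
    {"LanguageID": "WitotoNi\u0335pode"},
]
for row in rows:
    row["LanguageID"] = clean_language_id(row["LanguageID"])
```

An explicit table like this only touches the IDs that are known to be invalid, which keeps the change easy to review and to mirror upstream if needed.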

FredericBlum commented 4 months ago

> The Huitotoan ortho-profiles are now in the dataset. I did not find any information on the diphthongs of Nipode and Minica; perhaps they are the same as the ones in Murui?

Yes, I would assume that they are the same, since the languages are very closely related. If in doubt, check some cognates where the diphthongs occur.

FredericBlum commented 4 months ago

Regarding the missing concepts: some are simply not in the Swadesh list. For others, we could check whether the concepts can be adapted, like GO --> WALK, or RAIN (PRECIPITATION) --> RAIN (RAINING). Please consult the Spanish glosses for this. The other cases simply have to be ignored. Please also note that there are currently two transcription errors. Please use the Grouped_Sounds category as in the blumpanotacana dataset to fix this.
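As a rough illustration of the adaptation idea, the sketch below relabels a concept when an adapted counterpart is listed. The direction of the arrows (left side replaced by right side) is an assumption, and ADAPTATIONS and adapt_concept are hypothetical names; each adaptation would still need to be checked against the Spanish glosses.

```python
# Assumed reading of "A --> B": entries labeled A are relabeled as B.
ADAPTATIONS = {
    "GO": "WALK",
    "RAIN (PRECIPITATION)": "RAIN (RAINING)",
}


def adapt_concept(concept, adaptations=ADAPTATIONS):
    """Return the adapted concept label, or the original if none is listed."""
    return adaptations.get(concept, concept)


concepts = ["GO", "RAIN (PRECIPITATION)", "BLOW (OF WIND)"]
adapted = [adapt_concept(c) for c in concepts]
```

Concepts without a listed adaptation pass through unchanged, so they can still be reviewed or ignored separately.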

MuffinLinwist commented 3 months ago
  • Did you submit a fix to the original lexibank dataset for the replacements as well? This would be important for the LB2.0 releases

> Regarding the missing concepts: some are simply not in the Swadesh list. For others, we could check whether the concepts can be adapted, like GO --> WALK, or RAIN (PRECIPITATION) --> RAIN (RAINING). Please consult the Spanish glosses for this. The other cases simply have to be ignored. Please also note that there are currently two transcription errors. Please use the Grouped_Sounds category as in the blumpanotacana dataset to fix this.

I've been trying to use Grouped_Sounds but cannot, since the data is not yet segmented. Is there a way to do this at this stage, @FredericBlum? The latest commit contains the current state of the lexibank script I'm working with.

FredericBlum commented 3 months ago

Please add them as individual segments. We can group them later in Edictor and then handle this by checking the individual alignments.

MuffinLinwist commented 3 months ago

While parsing the Martius data, I noticed that some concepts were missing. I adapted the map_martius.py script and got this output:

Unmapped entries:
Doculect: Juri, Form: abi, Notes: Not Mapped
Doculect: Juri, Form: adfer!, Notes: Not Mapped
Doculect: Juri, Form: aegroto, Notes: Not Mapped
Doculect: Juri, Form: amita, Notes: Not Mapped
Doculect: Juri, Form: anima, Notes: Not Mapped
Doculect: Juri, Form: anima, Notes: Not Mapped
Doculect: Juri, Form: annus (unus), Notes: Not Mapped
Doculect: Juri, Form: anus, Notes: Not Mapped
Doculect: Juri, Form: habesne aquam?, Notes: Not Mapped
Doculect: Juri, Form: arcus coelestis, Notes: Not Mapped
Doculect: Juri, Form: auditus (meus?), Notes: Not Mapped
Doculect: Juri, Form: avia, Notes: Not Mapped
Doculect: Juri, Form: avunculus, Notes: Not Mapped
Doculect: Juri, Form: avunculus, Notes: Not Mapped
Doculect: Juri, Form: bellum gero, Notes: Not Mapped
Doculect: Juri, Form: brachium (meum), Notes: Not Mapped
Doculect: Juri, Form: brachium (meum), Notes: Not Mapped
Doculect: Juri, Form: brachium (meum), Notes: Not Mapped
Doculect: Juri, Form: calcaneus, Notes: Not Mapped
Doculect: Juri, Form: calidus, a, um, Notes: Not Mapped
Doculect: Juri, Form: capio (hostem), Notes: Not Mapped
Doculect: Juri, Form: cerevisia e granis mais, Notes: Not Mapped
Doculect: Juri, Form: chorda, Notes: Not Mapped
Doculect: Juri, Form: cilium, Notes: Not Mapped
Doculect: Juri, Form: clavicula, Notes: Not Mapped
Doculect: Juri, Form: coeruleus, Notes: Not Mapped
Doculect: Juri, Form: coeruleus, Notes: Not Mapped
Doculect: Juri, Form: cognatus, Notes: Not Mapped
Doculect: Juri, Form: connubo, Notes: Not Mapped
Doculect: Juri, Form: corbis, Notes: Not Mapped
Doculect: Juri, Form: costa, Notes: Not Mapped
Doculect: Juri, Form: culter, Notes: Not Mapped
Doculect: Juri, Form: da mihi, Notes: Not Mapped
Doculect: Juri, Form: diabolus, Notes: Not Mapped
Doculect: Juri, Form: diabolus, Notes: Not Mapped
Doculect: Juri, Form: dico, eloquor, Notes: Not Mapped
Doculect: Juri, Form: digit pedis, Notes: Not Mapped
Doculect: Juri, Form: durus, a, um, Notes: Not Mapped
Doculect: Juri, Form: femur, Notes: Not Mapped
Doculect: Juri, Form: femur, Notes: Not Mapped
Doculect: Juri, Form: foedus, a, um, Notes: Not Mapped
Doculect: Juri, Form: foedus, a, um, Notes: Not Mapped
Doculect: Juri, Form: frigidus, a, um, Notes: Not Mapped
Doculect: Juri, Form: gusto, Notes: Not Mapped
Doculect: Juri, Form: gusto, Notes: Not Mapped
Doculect: Juri, Form: hallux, Notes: Not Mapped
Doculect: Juri, Form: hebdomas una, Notes: Not Mapped
Doculect: Juri, Form: hesperus, Notes: Not Mapped
Doculect: Juri, Form: homines multi, Notes: Not Mapped
Doculect: Juri, Form: homines pauci, Notes: Not Mapped
Doculect: Juri, Form: hostis, Notes: Not Mapped
Doculect: Juri, Form: labium, Notes: Not Mapped
Doculect: Juri, Form: labium, Notes: Not Mapped
Doculect: Juri, Form: lacertus, Notes: Not Mapped
Doculect: Juri, Form: latus, a, um, Notes: Not Mapped
Doculect: Juri, Form: lectus pensilis, Notes: Not Mapped
Doculect: Juri, Form: lucifer, Notes: Not Mapped
Doculect: Juri, Form: luna prima, Notes: Not Mapped
Doculect: Juri, Form: luna prima, Notes: Not Mapped
Doculect: Juri, Form: luna nova, Notes: Not Mapped
Doculect: Juri, Form: luna nova, Notes: Not Mapped
Doculect: Juri, Form: luna plena, Notes: Not Mapped
Doculect: Juri, Form: luna plena, Notes: Not Mapped
Doculect: Juri, Form: luna decrescens, Notes: Not Mapped
Doculect: Juri, Form: macer, Notes: Not Mapped
Doculect: Juri, Form: magus, praestigiator, Notes: Not Mapped
Doculect: Juri, Form: mala, Notes: Not Mapped
Doculect: Juri, Form: malus, a, um, Notes: Not Mapped
Doculect: Juri, Form: mamma, Notes: Not Mapped
Doculect: Juri, Form: maritus (conjux), Notes: Not Mapped
Doculect: Juri, Form: membrum virile, Notes: Not Mapped
Doculect: Juri, Form: membrum virile, Notes: Not Mapped
Doculect: Juri, Form: membrum muliebre, Notes: Not Mapped
Doculect: Juri, Form: membrum muliebre, Notes: Not Mapped
Doculect: Juri, Form: mensis unus, Notes: Not Mapped
Doculect: Juri, Form: meridies, Notes: Not Mapped
Doculect: Juri, Form: meridies, Notes: Not Mapped
Doculect: Juri, Form: meus, Notes: Not Mapped
Doculect: Juri, Form: mingo, Notes: Not Mapped
Doculect: Juri, Form: mollis, e, Notes: Not Mapped
Doculect: Juri, Form: mortuus (est), Notes: Not Mapped
Doculect: Juri, Form: mulier mea, Notes: Not Mapped
Doculect: Juri, Form: mulier mea, Notes: Not Mapped
Doculect: Juri, Form: mulier tua, Notes: Not Mapped
Doculect: Juri, Form: mulier sua, Notes: Not Mapped
Doculect: Juri, Form: nox media, Notes: Not Mapped
Doculect: Juri, Form: nox media, Notes: Not Mapped
Doculect: Juri, Form: occido, Notes: Not Mapped
Doculect: Juri, Form: occiput, Notes: Not Mapped
Doculect: Juri, Form: olfacio, Notes: Not Mapped
Doculect: Juri, Form: olfacio, Notes: Not Mapped
Doculect: Juri, Form: olla, Notes: Not Mapped
Doculect: Juri, Form: omnes, Notes: Not Mapped
Doculect: Juri, Form: omnes, Notes: Not Mapped
Doculect: Juri, Form: orion, Notes: Not Mapped
Doculect: Juri, Form: orion, Notes: Not Mapped
Doculect: Juri, Form: oro, Notes: Not Mapped
Doculect: Juri, Form: os, oris, Notes: Not Mapped
Doculect: Juri, Form: os, oris, Notes: Not Mapped
Doculect: Juri, Form: os, oris, Notes: Not Mapped
Doculect: Juri, Form: os, ossis, Notes: Not Mapped
Doculect: Juri, Form: os, ossis, Notes: Not Mapped
Doculect: Juri, Form: panis mandioccae, Notes: Not Mapped
Doculect: Juri, Form: patella, Notes: Not Mapped
Doculect: Juri, Form: patella, Notes: Not Mapped
Doculect: Juri, Form: pleiades, Notes: Not Mapped
Doculect: Juri, Form: pleiades, Notes: Not Mapped
Doculect: Juri, Form: planto, Notes: Not Mapped
Doculect: Juri, Form: poples, Notes: Not Mapped
Doculect: Juri, Form: profundus, Notes: Not Mapped
Doculect: Juri, Form: profundus, Notes: Not Mapped
Doculect: Juri, Form: puella, Notes: Not Mapped
Doculect: Juri, Form: pulcher, Notes: Not Mapped
Doculect: Juri, Form: pulcher, Notes: Not Mapped
Doculect: Juri, Form: remus, Notes: Not Mapped
Doculect: Juri, Form: salto, Notes: Not Mapped
Doculect: Juri, Form: scapula, Notes: Not Mapped
Doculect: Juri, Form: sebum, Notes: Not Mapped
Doculect: Juri, Form: semita, via, Notes: Not Mapped
Doculect: Juri, Form: sepelio, Notes: Not Mapped
Doculect: Juri, Form: sibilo, Notes: Not Mapped
Doculect: Juri, Form: sic, sane, recte, Notes: Not Mapped
Doculect: Juri, Form: sic, sane, recte, Notes: Not Mapped
Doculect: Juri, Form: sicera, Notes: Not Mapped
Doculect: Juri, Form: supercilium, Notes: Not Mapped
Doculect: Juri, Form: supercilium, Notes: Not Mapped
Doculect: Juri, Form: sylva, Notes: Not Mapped
Doculect: Juri, Form: sylva, Notes: Not Mapped
Doculect: Juri, Form: tabacum, Notes: Not Mapped
Doculect: Juri, Form: tempus matutinum, Notes: Not Mapped
Doculect: Juri, Form: tempus matutinum, Notes: Not Mapped
Doculect: Juri, Form: testiculi, Notes: Not Mapped
Doculect: Juri, Form: testiculi, Notes: Not Mapped
Doculect: Juri, Form: tugurium, Notes: Not Mapped
Doculect: Juri, Form: tugurium nostrum, Notes: Not Mapped
Doculect: Juri, Form: veni huc!, Notes: Not Mapped
Doculect: Juri, Form: vespere, Notes: Not Mapped
Doculect: Juri, Form: umbilicus, Notes: Not Mapped
Doculect: Juri, Form: umbilicus, Notes: Not Mapped
Doculect: Juri, Form: unguis, Notes: Not Mapped
Doculect: Juri, Form: unguis, Notes: Not Mapped
Doculect: Juri, Form: volo, velle, Notes: Not Mapped
Doculect: Juri, Form: 1, Notes: Not Mapped
Doculect: Juri, Form: 1, Notes: Not Mapped
Doculect: Juri, Form: 1, Notes: Not Mapped
Doculect: Juri, Form: 2, Notes: Not Mapped
Doculect: Juri, Form: 2, Notes: Not Mapped
Doculect: Juri, Form: 2, Notes: Not Mapped
Doculect: Juri, Form: tapirus americanus, Notes: Not Mapped
Doculect: Juri, Form: tapirus americanus, Notes: Not Mapped
Doculect: Juri, Form: felis onça, Notes: Not Mapped
Doculect: Juri, Form: felis onça, Notes: Not Mapped
Doculect: Juri, Form: felis pardalis (maracaja), Notes: Not Mapped
Doculect: Juri, Form: felis concolor (çuçuarana), Notes: Not Mapped
Doculect: Juri, Form: canis azarae, Notes: Not Mapped
Doculect: Juri, Form: cebus fatuellus (prego), Notes: Not Mapped
Doculect: Juri, Form: cebus gracilis (caiarara), Notes: Not Mapped
Doculect: Juri, Form: callithrix torquato (oyapussá), Notes: Not Mapped
Doculect: Juri, Form: lagothrix canus et Humboldti Geoffr. (barrigudo), Notes: Not Mapped
Doculect: Juri, Form: pithecia hirsuta (paraoá), Notes: Not Mapped
Doculect: Juri, Form: pithecia ouacary (simia melanocephalus Hb.), Notes: Not Mapped
Doculect: Juri, Form: nyetipithecus felinus (yá), Notes: Not Mapped
Doculect: Juri, Form: dasypus (tatu) major, Notes: Not Mapped
Doculect: Juri, Form: dasypus minor, Notes: Not Mapped
Doculect: Juri, Form: nasua, Notes: Not Mapped
Doculect: Juri, Form: nasua, Notes: Not Mapped
Doculect: Juri, Form: hydrochoerus capivara, Notes: Not Mapped
Doculect: Juri, Form: dicotyles, Notes: Not Mapped
Doculect: Juri, Form: coelogenys paca, Notes: Not Mapped
Doculect: Juri, Form: coelogenys paca, Notes: Not Mapped
Doculect: Juri, Form: dasyprocta aguti, Notes: Not Mapped
Doculect: Juri, Form: dasyprocta aguti, Notes: Not Mapped
Doculect: Juri, Form: myrmecophaga jubata, Notes: Not Mapped
Doculect: Juri, Form: bradypus tridaetylus, Notes: Not Mapped
Doculect: Juri, Form: manatus, Notes: Not Mapped
Doculect: Juri, Form: delphinus, Notes: Not Mapped
Doculect: Juri, Form: crax globulosa (mutum de faba vel açu), Notes: Not Mapped
Doculect: Juri, Form: crax tuberosa (mutum de vargem), Notes: Not Mapped
Doculect: Juri, Form: crax urumutum, Notes: Not Mapped
Doculect: Juri, Form: psittacus macao, Notes: Not Mapped
Doculect: Juri, Form: psittacus ararauna, Notes: Not Mapped
Doculect: Juri, Form: psittacus (minor) perikito, Notes: Not Mapped
Doculect: Juri, Form: rhamphastos, Notes: Not Mapped
Doculect: Juri, Form: penelope aracuan (aracuan), Notes: Not Mapped
Doculect: Juri, Form: penelope cumanensis (cuxuby), Notes: Not Mapped
Doculect: Juri, Form: gallinula plumbea (saracura), Notes: Not Mapped
Doculect: Juri, Form: anas brasiliensis, Notes: Not Mapped
Doculect: Juri, Form: emys amazonica, Notes: Not Mapped
Doculect: Juri, Form: agama (camaleâo), Notes: Not Mapped
Doculect: Juri, Form: bufo agoa, Notes: Not Mapped
Doculect: Juri, Form: rana, Notes: Not Mapped
Doculect: Juri, Form: lacerta, Notes: Not Mapped
Doculect: Juri, Form: crocodilus niger, Notes: Not Mapped
Doculect: Juri, Form: scarabaeus, Notes: Not Mapped
Doculect: Juri, Form: fructus musae, Notes: Not Mapped
Doculect: Juri, Form: fructus musae, Notes: Not Mapped
Doculect: Coëruna, Form: albus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: anima, Notes: Not Mapped
Doculect: Coëruna, Form: audire, Notes: Not Mapped
Doculect: Coëruna, Form: avia, Notes: Not Mapped
Doculect: Coëruna, Form: avunculus, Notes: Not Mapped
Doculect: Coëruna, Form: bibo, ere, Notes: Not Mapped
Doculect: Coëruna, Form: brevis, e, Notes: Not Mapped
Doculect: Coëruna, Form: caeruleus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: connubo, ere, Notes: Not Mapped
Doculect: Coëruna, Form: digitus minimus, Notes: Not Mapped
Doculect: Coëruna, Form: dormio, ire, Notes: Not Mapped
Doculect: Coëruna, Form: edo, ere, Notes: Not Mapped
Doculect: Coëruna, Form: femur, Notes: Not Mapped
Doculect: Coëruna, Form: flavus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: foedus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: frons, tis, Notes: Not Mapped
Doculect: Coëruna, Form: gusto, are, Notes: Not Mapped
Doculect: Coëruna, Form: homo (vir), Notes: Not Mapped
Doculect: Coëruna, Form: labium, Notes: Not Mapped
Doculect: Coëruna, Form: lacerta, Notes: Not Mapped
Doculect: Coëruna, Form: lacertus, Notes: Not Mapped
Doculect: Coëruna, Form: latus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: longus, a, u, Notes: Not Mapped
Doculect: Coëruna, Form: lucifer (sidus), Notes: Not Mapped
Doculect: Coëruna, Form: luna prima, Notes: Not Mapped
Doculect: Coëruna, Form: luna nova, Notes: Not Mapped
Doculect: Coëruna, Form: luna plena, Notes: Not Mapped
Doculect: Coëruna, Form: luna decrescens, Notes: Not Mapped
Doculect: Coëruna, Form: macer, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: magnus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: mala, Notes: Not Mapped
Doculect: Coëruna, Form: membrum virile, Notes: Not Mapped
Doculect: Coëruna, Form: membrum muliebre, Notes: Not Mapped
Doculect: Coëruna, Form: meridies, Notes: Not Mapped
Doculect: Coëruna, Form: mingo, ere, Notes: Not Mapped
Doculect: Coëruna, Form: multus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: niger, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: olfacio, ere, Notes: Not Mapped
Doculect: Coëruna, Form: omnes, Notes: Not Mapped
Doculect: Coëruna, Form: orion, Notes: Not Mapped
Doculect: Coëruna, Form: oro, are, Notes: Not Mapped
Doculect: Coëruna, Form: os, oris, Notes: Not Mapped
Doculect: Coëruna, Form: os, ossis, Notes: Not Mapped
Doculect: Coëruna, Form: parvus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: patella, Notes: Not Mapped
Doculect: Coëruna, Form: paucus, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: pectus, oris, Notes: Not Mapped
Doculect: Coëruna, Form: pes, pedis, Notes: Not Mapped
Doculect: Coëruna, Form: pinguis, e, Notes: Not Mapped
Doculect: Coëruna, Form: pleiades, Notes: Not Mapped
Doculect: Coëruna, Form: pulcher, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: ruber, a, um, Notes: Not Mapped
Doculect: Coëruna, Form: salto, are, Notes: Not Mapped
Doculect: Coëruna, Form: sapio, ere, Notes: Not Mapped
Doculect: Coëruna, Form: sepelio, ire, Notes: Not Mapped
Doculect: Coëruna, Form: sibilo, are, Notes: Not Mapped
Doculect: Coëruna, Form: sic, sane, recte, Notes: Not Mapped
Doculect: Coëruna, Form: sicera, Notes: Not Mapped
Doculect: Coëruna, Form: supercilium, Notes: Not Mapped
Doculect: Coëruna, Form: sylva, Notes: Not Mapped
Doculect: Coëruna, Form: tempus matutinum, Notes: Not Mapped
Doculect: Coëruna, Form: testiculi, Notes: Not Mapped
Doculect: Coëruna, Form: umbilicus, Notes: Not Mapped
Doculect: Coëruna, Form: unguis, Notes: Not Mapped
Doculect: Coëruna, Form: venor, ari, Notes: Not Mapped
Doculect: Coëruna, Form: vespere, Notes: Not Mapped
Doculect: Coëruna, Form: viridis, e, Notes: Not Mapped
Doculect: Coëruna, Form: tapirus americanus, Notes: Not Mapped
Doculect: Coëruna, Form: felix onça, Notes: Not Mapped
Doculect: Coëruna, Form: nasua, Notes: Not Mapped
Doculect: Coëruna, Form: dicotyles, Notes: Not Mapped
Doculect: Coëruna, Form: hydrochoerus capivara, Notes: Not Mapped
Doculect: Coëruna, Form: coelogenys paca, Notes: Not Mapped
Doculect: Coëruna, Form: dasyprocta aguti, Notes: Not Mapped
Doculect: Coëruna, Form: crocodilus, Notes: Not Mapped
Doculect: Coëruna, Form: bufo agoa, Notes: Not Mapped
Doculect: Coëruna, Form: rana, Notes: Not Mapped
Doculect: Coëruna, Form: scarabaeus, Notes: Not Mapped
Doculect: Coëruna, Form: fructus musae, Notes: Not Mapped
Doculect: Coëruna, Form: deus pronobis facit fluvium, sylvam, omnem aquam, omne!, Notes: Not Mapped
Doculect: Coëruna, Form: omne pronobis factum est, ut bene vivamus, Notes: Not Mapped
Doculect: Coëruna, Form: bonum esse oportet nos eliam, Notes: Not Mapped
Doculect: Coëruna, Form: bene et sine offensa vivere cum sociis, Notes: Not Mapped
Doculect: Pebas, Form: capilli, Notes: Not Mapped
Doculect: Pebas, Form: coeruleus, Notes: Not Mapped
Doculect: Pebas, Form: cymba, Notes: Not Mapped
Doculect: Pebas, Form: diabolus, Notes: Not Mapped
Doculect: Pebas, Form: folia, Notes: Not Mapped
Doculect: Pebas, Form: frons, tis, Notes: Not Mapped
Doculect: Pebas, Form: hasta, Notes: Not Mapped
Doculect: Pebas, Form: lumen, Notes: Not Mapped
Doculect: Pebas, Form: mentum, Notes: Not Mapped
Doculect: Pebas, Form: nidus, Notes: Not Mapped
Doculect: Pebas, Form: os, oris, Notes: Not Mapped
Doculect: Pebas, Form: sabulum, Notes: Not Mapped
Doculect: Pebas, Form: sidera, Notes: Not Mapped
Doculect: Pebas, Form: sic, sane, Notes: Not Mapped
Doculect: Pebas, Form: tubulus pro sagittulis explodendis, Notes: Not Mapped
Doculect: Pebas, Form: unguis, Notes: Not Mapped
Doculect: Pebas, Form: venenum sagittarum, Notes: Not Mapped
Doculect: Pebas, Form: via, semita, Notes: Not Mapped
Doculect: Pebas, Form: 1, Notes: Not Mapped
Doculect: Pebas, Form: 2, Notes: Not Mapped
Doculect: Pebas, Form: tigris, Notes: Not Mapped
Doculect: Pebas, Form: tigris, Notes: Not Mapped
Doculect: Pebas, Form: simia (in genere), Notes: Not Mapped
Doculect: Pebas, Form: midas, Notes: Not Mapped
Doculect: Pebas, Form: chrysothrix, Notes: Not Mapped
Doculect: Pebas, Form: callithrix nigrifrons Sp., Notes: Not Mapped
Doculect: Pebas, Form: ateles paniscus, Notes: Not Mapped
Doculect: Pebas, Form: mycetes, Notes: Not Mapped
Doculect: Pebas, Form: lagothrix, Notes: Not Mapped
Doculect: Pebas, Form: tapirus, Notes: Not Mapped
Doculect: Pebas, Form: crax, Notes: Not Mapped
Doculect: Pebas, Form: psittacus macao, Notes: Not Mapped
Doculect: Pebas, Form: psittacus, Notes: Not Mapped
Doculect: Pebas, Form: crocodilus, Notes: Not Mapped
Doculect: Pebas, Form: fructus musae, Notes: Not Mapped
Doculect: Pebas, Form: mandiocca, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: anima, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: avia, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: avunculus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: bibo, ere, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: bellum gerere, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: brevis, e, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: caeruleus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: capio, ere (captivos), Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: connubo, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: diabolus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: digitus pedis major (hallux), Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: digitus minimus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: edo, ere, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: femur, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: foedus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: frons, tis, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: gusto, are, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: hesperus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: homo (vir), Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: hostis, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: labium, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: lacerta, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: lacertus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: lactus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: longus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: lucifer (sidus), Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: luna prima, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: luna nova, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: luna plena, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: luna decrescens, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: macer, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: magnus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: mala, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: membrum virile, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: membrum muliebre, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: meridies, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: meus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: mingo, ere, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: mortuus (est), Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: mullue, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: occido, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: olfacio, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: omnes, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: orion, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: oro, are, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: os, oris, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: os, ossis, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: parvus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: patella, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: paucus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: pes, pedis, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: pinguis, e, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: plantare, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: pleiades, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: pulcher, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: salto, are, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: sepelio, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: sibilio, are, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: sic, sane, recte, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: sicera, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: supercilium, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: sylva, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: tempus matutinum, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: testiculi, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: tuus, a, um, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: umbilicus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: unguis, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: venor, ari, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: vespere, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: volo, velle, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 1, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 2, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 11, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 12, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 13, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 14, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: 15, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: tapirus americanus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: felis onça, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: nasua, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: hydrochoerus capivara, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: dicotyles, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: coelogenys paca, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: dasyprocta aguti, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: crocodilus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: bufo agoa, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: scarabaeus, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: fructus musae, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: Ego dux Joann Manoel, valens, alborum amicus, captivo omnes, Notes: Not Mapped
Doculect: MiranhaCarapana-Tapuya, Form: Deus fecit omne, coelum et fluvium et animalia pro nobis, Notes: Not Mapped

I'll work on a strategy to fix this tomorrow, @FredericBlum. If you have any suggestions, they are more than welcome. After fixing this, I'll run an updated CLDF conversion with the Martius data.
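To get an overview of a log like the one above, a small summary per doculect can help. This is only a sketch that assumes the exact line format shown ("Doculect: ..., Form: ..., Notes: ..."); the function name summarize is hypothetical, and the field labeled Form is treated as the Latin gloss, which may itself contain commas.

```python
import re
from collections import Counter

# The doculect is matched lazily; the gloss is matched greedily so that
# commas inside it (e.g. "sic, sane") stay in one field.
LINE = re.compile(
    r"^Doculect: (?P<doculect>.+?), Form: (?P<form>.+), Notes: (?P<notes>.+)$"
)


def summarize(log_text):
    """Count unmapped lines per doculect and collect the distinct glosses."""
    counts = Counter()
    glosses = {}
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip the header line and blank lines
        counts[match["doculect"]] += 1
        glosses.setdefault(match["doculect"], set()).add(match["form"])
    return counts, glosses


sample = """Unmapped entries:
Doculect: Juri, Form: anima, Notes: Not Mapped
Doculect: Juri, Form: anima, Notes: Not Mapped
Doculect: Pebas, Form: sic, sane, Notes: Not Mapped
"""
counts, glosses = summarize(sample)
```

Comparing the line count with the number of distinct glosses per doculect shows how many lines are repeats of the same gloss, which reduces the amount of manual mapping needed.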

FredericBlum commented 3 months ago

I don't think there's anything wrong, it's just that the mapping for Latin does not work as well as with German/Spanish. The only solution is to provide a manual mapping - something that also has to be done for German.

  1. Run the automated mappings
  2. Check manually for missing and wrong mappings (--> Save into new file!)
  3. Add manually modified file to data

We have done the same with the Tessmann data.
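The three steps above can be sketched as follows, assuming the manually checked mappings are stored in a tab-separated file. The column names GLOSS and CONCEPT, the function names, and the example mappings are illustrative assumptions, not the layout of the actual scripts or data files.

```python
import csv
import io


def load_manual_mapping(handle):
    """Read manually checked gloss-to-concept pairs, skipping empty concepts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["GLOSS"]: row["CONCEPT"] for row in reader if row["CONCEPT"]}


def map_glosses(glosses, automatic, manual):
    """Prefer manual mappings over automatic ones; report what stays unmapped."""
    mapped, unmapped = {}, []
    for gloss in glosses:
        concept = manual.get(gloss) or automatic.get(gloss)
        if concept:
            mapped[gloss] = concept
        else:
            unmapped.append(gloss)
    return mapped, unmapped


# Step 1: result of the automated mapping (illustrative).
automatic = {"sylva": "FOREST"}
# Step 2: manually checked file; an in-memory stand-in for a real TSV file.
manual_file = io.StringIO(
    "GLOSS\tCONCEPT\nos, oris\tMOUTH\nos, ossis\tBONE\nsicera\t\n"
)
manual = load_manual_mapping(manual_file)
# Step 3: combine both when building the data.
mapped, unmapped = map_glosses(
    ["sylva", "os, oris", "os, ossis", "sicera"], automatic, manual
)
```

Keeping the manual file separate from the automated output means a rerun of the automated mapping does not overwrite the manual corrections.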

MuffinLinwist commented 3 months ago

> Regarding the missing concepts: some are simply not in the Swadesh list. For others, we could check whether the concepts can be adapted, like GO --> WALK, or RAIN (PRECIPITATION) --> RAIN (RAINING). Please consult the Spanish glosses for this. The other cases simply have to be ignored. Please also note that there are currently two transcription errors. Please use the Grouped_Sounds category as in the blumpanotacana dataset to fix this.

@FredericBlum, my concern was mainly with the BLOW (OF WIND) concept, which is present in the Swadesh-200 list. I think adapting some concepts for the Yagua data, as you suggested, is a good idea, so I'll implement that.

MuffinLinwist commented 3 months ago

> I don't think there's anything wrong, it's just that the mapping for Latin does not work as well as with German/Spanish. The only solution is to provide a manual mapping - something that also has to be done for German.
>
>   1. Run the automated mappings
>   2. Check manually for missing and wrong mappings (--> Save into new file!)
>   3. Add manually modified file to data
>
> We have done the same with the Tessmann data.

This is done and included in the latest commit of the PR, along with another run of the parse_martius.py script.

FredericBlum commented 3 months ago

Thanks for the quick responses! The one about the sources is important for the release, but most of the others we can ignore. Sorry for being pedantic about them.

MuffinLinwist commented 3 months ago

Now the sources are fixed. I also adopted the solution for the breathy vowels in Urarina, and the unmapped Yagua concepts cannot be adapted from those present in the Swadesh-1972-200 list. The only remaining problem is the entries for BLOW (OF WIND), which are not present in the Boran languages.

FredericBlum commented 3 months ago

Cool, thanks. Does that mean we are good to merge?

MuffinLinwist commented 3 months ago

yes