Closed. LinguList closed this pull request 3 years ago.
BTW: this adds a new command:
$ clts map phoible/graphemes.tsv
This command maps the unmapped sounds in the file graphemes.tsv and writes the mapped version to a file graphemes.mapped.tsv. This file should be checked manually (automatically identified pre-nasalized nasals, etc. are marked by an asterisk; clusters are marked by (!) for extra attention). After annotation, graphemes.tsv can be pushed as a PR and checked by the CLTS team.
This gives us a mapping procedure similar to the one in Concepticon, where we effectively map things manually but use clts for pre-processing.
So far, CLTS has done everything automatically; this adds more direct control.
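For readers who want a feel for the pre-processing step described above, here is a minimal sketch. It is not the code in src/pyclts/commands/map.py and it does not call pyclts at all: the `KNOWN` table, the `suggest` and `map_tsv` helpers, the `GRAPHEME` and `BIPA` column names, and the placement of the `*` and `(!)` markers in front of the suggestion are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical stand-in for a CLTS lookup (assumption: the real command
# consults a transcription system, not a hard-coded table).
KNOWN = {"a": "a", "p": "p", "m": "m", "b": "b", "d": "d"}


def suggest(grapheme):
    """Return a suggested mapping, flagged for manual review where needed."""
    if grapheme in KNOWN:
        return KNOWN[grapheme]
    # Toy rule: a superscript n plus a known sound is treated as an
    # automatically identified pre-nasalized sound and marked with "*".
    if len(grapheme) > 1 and grapheme[0] == "ⁿ" and grapheme[1:] in KNOWN:
        return "*" + grapheme
    # Toy rule: several known sounds in a row are treated as a cluster
    # and marked with "(!)" for extra attention.
    if len(grapheme) > 1 and all(ch in KNOWN for ch in grapheme):
        return "(!)" + grapheme
    return ""  # still unmapped, left for the annotator


def map_tsv(text, source="GRAPHEME", target="BIPA"):
    """Fill empty target cells of a TSV string and return the new TSV."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        if not row.get(target):  # only touch unmapped rows
            row[target] = suggest(row[source])
        rows.append(row)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In this sketch, rows that already carry a mapping are left alone, so a hand-corrected file can be run through it again without losing the annotator's work. Writing the returned string to graphemes.mapped.tsv would give the file that is then checked by hand.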
Yes, good, it is pretty much what I did this morning, quick and dirty, to map JIPA.
But does this mean we would not use the Google Sheet after all? Once this is approved and merged, I could just go through the graphemes.tsv in Phoible's transcription data, perhaps before handing it over to @cormacanderson?
Since I want @Cormacanderson and you to work together on this (and together with me), we will use the Google Sheet, where I am now uploading the files. For any additional datasets, you can go the manual way, but I assume Cormac prefers to work on a spreadsheet.
Hi both. I'm ready to start going through whenever. Just explain to me clearly what you would like me to do. My preference is a .csv or .tsv file, to work on it offline and then to upload it again. Would that work?
That also works, and you can still use the Google Sheet: just copy one sheet there, work on it, and upload it directly to the sheet, if that is okay?
Merging #26 into master will decrease coverage by 3.58%. The diff coverage is 34.93%.
@@ Coverage Diff @@
## master #26 +/- ##
==========================================
- Coverage 96.68% 93.10% -3.59%
==========================================
Files 30 31 +1
Lines 1356 1435 +79
==========================================
+ Hits 1311 1336 +25
- Misses 45 99 +54
Impacted Files | Coverage Δ | |
---|---|---|
src/pyclts/models.py | 100.00% <ø> (ø) | |
src/pyclts/commands/map.py | 11.66% <11.66%> (ø) | |
src/pyclts/transcriptionsystem.py | 96.70% <95.65%> (-0.23%) | :arrow_down: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dfddc28...cbdb922.
@xrotwang and @tresoldi, this contains some changes that I consider quite important to make sure that we do not overgeneralize (as we were doing before, see #25). This adds more code to the functions, but the changes are, in my opinion, safe. It will slightly increase the number of unknown sounds in clts, I think, but otherwise it should not do much harm.