lingdb / Sound-Comparisons

Exploring phonetic diversity across language families

Data Quality Control #480

Open PaulHeggarty opened 6 years ago

PaulHeggarty commented 6 years ago

Miscellaneous Quality Control Issues with Slavic

Languages with many sound files failing to show -- missing from the server?

There are cases where languages that should have a full set of sound files do not. This is most likely because those files failed to upload properly to the server (or have somehow been deleted since). Please check these and fix them. Please also use the diagnostic tools to identify and fix all other such cases. Here are some first examples.
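As a rough sketch of the kind of check meant here (this is not the project's actual diagnostic tooling), the comparison boils down to a set difference per language. The function name `missing_sound_report`, the input shapes, and the `min_missing` threshold are all assumptions made for illustration; how the list of files on the server is obtained is left open.

```python
def missing_sound_report(expected_words, words_with_sound, min_missing=1):
    """Map each language to the expected words that have no sound file.

    expected_words:   iterable of word IDs every language should have (assumed input)
    words_with_sound: dict {language: iterable of word IDs found on the server}
    min_missing:      only report languages missing at least this many words
    """
    expected = set(expected_words)
    report = {}
    for language, present in words_with_sound.items():
        missing = sorted(expected - set(present))
        if len(missing) >= min_missing:
            report[language] = missing
    return report


if __name__ == "__main__":
    # Hypothetical language and word IDs, for illustration only.
    expected = ["one", "two", "woman"]
    on_server = {"LangA": ["one", "two", "woman"], "LangB": ["one"]}
    print(missing_sound_report(expected, on_server))
```

Raising `min_missing` would separate languages with a handful of gaps from those where most of the set failed to reach the server.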

Transcriptions do not match sound files

For Germanic study:

For Malakula study: In some Malakula languages, there's a mismatch that affects many words in that language. Listen to 'woman' in the languages below, for example: the transcription is clearly for another word, and does not match the sound file. (Try a few more examples and you'll hear/see the other mismatches.)

There are two possible causes: 1) There is a problem with the Excel grid, which does not sync correctly with the big sound file for this language, so the segmentation and export went wrong. 2) When the transcriptions were pasted from Aviva's own Excel file into the SndComp transcriptions template, word and transcription went out of step at some point in the transcriptions list. The correct transcription for 'woman', for example, is the one in the next cell down.

We need to (manually) identify all the languages affected, track down the cause in each case, and then either re-segment the sound files or re-create the SQL transcription files.
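For cause 2, part of the manual search could perhaps be narrowed down automatically. The sketch below assumes that, for a given language, both the original transcription column and the pasted template column are available as ordered lists; `find_shift_start` and its parameters are hypothetical names, not part of the SndComp tooling. It looks for the first row from which the pasted column matches the source column displaced by a fixed number of cells.

```python
def find_shift_start(source, pasted, max_offset=2, confirm=3):
    """Find where `pasted` starts to be a shifted copy of `source`.

    Returns (index, offset) such that pasted[index:] lines up with
    source[index + offset:] for at least `confirm` rows (fewer at the
    end of the list), or None if no such shift is found.
    A negative offset means the correct transcription sits further
    down in the pasted column than the word it belongs to.
    """
    offsets = [o for o in range(-max_offset, max_offset + 1) if o != 0]
    for i in range(len(pasted)):
        if i < len(source) and pasted[i] == source[i]:
            continue  # this row is still aligned
        run = pasted[i:i + confirm]
        for offset in offsets:
            start = i + offset
            if start < 0:
                continue
            # Require a run of consecutive matches so that one repeated
            # transcription does not look like a shift.
            if source[start:start + len(run)] == run:
                return i, offset
    return None


if __name__ == "__main__":
    # Placeholder strings stand in for real transcriptions.
    source = ["t1", "t2", "t3", "t4", "t5", "t6"]
    pasted = ["t1", "??", "t2", "t3", "t4", "t5"]  # one stray cell at row 1
    print(find_shift_start(source, pasted))
```

This only flags candidates. Cause 1 (a segmentation grid out of sync with the sound file) would not show up in such a comparison and still needs listening.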

Transcriptions too long and strange

Transcriptions missing on the website

Both sound files and transcriptions missing on the website

Transcription of lex2 online, but the sound is missing

No master TextGrid found


Region 2

Region 6

Languages in the Transcription Excel file but not on the website

Language is NEITHER in Aviva's compiled transcriptions NOR in Laura's Rgn-tables, BUT is online
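The two categories above amount to set differences between three language inventories. A minimal sketch, assuming each inventory can be exported as a plain list of language names (the function name `cross_check_inventories` and the result keys are made up for illustration):

```python
def cross_check_inventories(excel_langs, region_table_langs, website_langs):
    """Compare three language inventories and report the discrepancies.

    excel_langs:        languages in the transcription Excel file
    region_table_langs: languages in the Rgn-tables
    website_langs:      languages currently shown on the website
    """
    excel = set(excel_langs)
    tables = set(region_table_langs)
    site = set(website_langs)
    return {
        # In the Excel file but not on the website
        "in_excel_not_online": sorted(excel - site),
        # Online, but in neither the Excel file nor the Rgn-tables
        "online_only": sorted(site - excel - tables),
    }


if __name__ == "__main__":
    # Hypothetical language names, for illustration only.
    result = cross_check_inventories(
        excel_langs=["LangA", "LangB"],
        region_table_langs=["LangA", "LangC"],
        website_langs=["LangA", "LangC", "LangD"],
    )
    print(result)
```

Spelling variants of the same language name would show up as false discrepancies, so names may need normalising before the comparison.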

Empty pages

LuPaschen commented 5 years ago

Adding random stuff here, updating regularly

Germanic: Central German

Germanic: Upper German

Germanic: Diaspora German
