philcombiths / PhonDPA

Phon_DPA is a Python script that converts raw Excel (xls) data from the Developmental Phonologies Archive (Gierut, 2015) into Phon-readable comma-separated values (csv) files.
Apache License 2.0

Many R-side diacritics not translated as expected #3

Closed philcombiths closed 4 years ago

philcombiths commented 4 years ago

Many R-side diacritics not translated as expected. Records with errors are ignored by Phon. Examples:
Data.2093_CCP_CCP Pre 19 IPA Actual 1 bʊɹˀ̵
Data.2093_OCP_OCP PS 90 IPA Actual 1 ɡɑɹʰ̵
Data.4746_CCP_CCP Post 71 IPA Actual 1 d̥oʊᵗ̵̪
Data.4746_OCP_OCP Pre 70 IPA Actual 1 hoʊᵊ̵

philcombiths commented 4 years ago

The likely cause is that these records are interpreted as multiple productions because of the number of spaces in their cells. Requiring 5 spaces to trigger a multiple production fixes this; however, it also causes roughly 1,500 genuine multiple productions to go uncounted by the script. A proper fix may require a more complex regular expression (re) pattern, or treating these instances as exceptions.
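The space-threshold idea in this comment could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name split_productions, the min_spaces parameter, and the sample strings in the usage note are all hypothetical, and it assumes that productions within a cell are separated only by runs of ordinary space characters.

```python
import re
from typing import List


def split_productions(cell: str, min_spaces: int = 5) -> List[str]:
    """Split a cell into productions on runs of at least `min_spaces` spaces.

    Hypothetical helper: shorter runs of spaces are left inside a single
    production instead of triggering a split.
    """
    # Build a pattern such as " {5,}" that matches 5 or more consecutive spaces.
    separator = re.compile(r" {%d,}" % min_spaces)
    # Strip outer whitespace first so leading/trailing padding yields no empty parts.
    parts = separator.split(cell.strip())
    # Drop any empty strings (for example, when the cell is blank).
    return [part for part in parts if part]
```

With the default threshold, an invented cell such as "ba  da" (two spaces) stays a single production, while "ba     da" (five spaces) splits into two. As the comment notes, a fixed threshold trades one error for another, so records with fewer separating spaces would still need a richer pattern or an exception list.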