IPS-LMU / emuR

The main R package for the EMU Speech Database Management System (EMU-SDMS)
http://ips-lmu.github.io/EMU.html

Running g2p with mapping outside web service #232

Closed by nikopartanen 4 years ago

nikopartanen commented 4 years ago

Hi,

Sorry if this question is beyond the scope of emuR. I have created a SAMPA mapping for the language I work with, Komi-Zyrian, and it seems to work really well now. I use it in an emuR pipeline like this:

runBASwebservice_g2pForPronunciation(handle = db_handle,
                                     orthoAttributeDefinitionName = 'ORT',
                                     language = 'und', 
                                     canoAttributeDefinitionName = 'KAN', 
                                     params = list(embed = 'maus', imap=RCurl::fileUpload("kpv-sampa.txt")), 
                                     resume = FALSE, 
                                     verbose = TRUE)

However, I would like to run this mapping over a very large transcribed corpus, just to check that all the strange edge cases also come out as intended. More generally, using this mapping as an orthography > SAMPA transliteration step would be highly useful in other situations. So is it available somewhere as an independent tool or script that I could wrap from R, for example? I understood that there is a Perl script called g2p.pl, but I couldn't find it anywhere.

Thanks for your great work!

raphywink commented 4 years ago

Hi Niko,

Sorry for the late reply on this one. Somehow I didn't get an email notification, and my main focus has been the EMU-webApp in the last few weeks, so I haven't checked the emuR issues in a while.

For what you are after I'd recommend looking at the BAS web services. The g2p.pl script (if you are referring to the g2p tool written by Uwe Reichel) is available as a web service there: https://clarin.phonetik.uni-muenchen.de/BASWebServices/interface/Grapheme2Phoneme You can also call that service with curl, as described at https://clarin.phonetik.uni-muenchen.de/BASWebServices/help/help_developer#help_developer and https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/help and from R you can send the same requests using the RCurl package.
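As a rough illustration of the RCurl route, here is a minimal sketch. It is not taken from emuR or from the BAS documentation: the `runG2P` endpoint path, the `i`, `iform`, `oform` and `outsym` parameter names and values, and the XML element names in the response (`success`, `downloadLink`, `output`) are all assumptions that should be checked against the services/help page above. Only `lng = "und"` and `imap` come from the call earlier in this thread. The helper names `extract_tag` and `run_g2p` and the file name `corpus.txt` are made up for the example.

```r
# Assumed endpoint; verify against the BAS services/help page.
g2p_url <- "https://clarin.phonetik.uni-muenchen.de/BASWebServices/services/runG2P"

# Return the text content of the first <tag>...</tag> element, or NA if absent.
# A regex is enough for a flat, single-line response; use an XML parser
# for anything more complex.
extract_tag <- function(xml, tag) {
  xml <- as.character(xml)[1]  # drop attributes RCurl may attach
  pattern <- sprintf("<%s>(.*?)</%s>", tag, tag)
  m <- regmatches(xml, regexec(pattern, xml, perl = TRUE))[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

# Send an orthographic text file plus a custom mapping file to the service
# and return the transcription as one string.
# Extra service parameters (e.g. embed = "maus") can be passed through `...`.
run_g2p <- function(text_file, map_file, lng = "und", ...) {
  response <- RCurl::postForm(
    g2p_url,
    i      = RCurl::fileUpload(text_file),  # orthographic input
    imap   = RCurl::fileUpload(map_file),   # custom grapheme-to-SAMPA mapping
    lng    = lng,
    iform  = "txt",    # assumed value: plain running text
    oform  = "tab",    # assumed value: tab-separated word/transcription table
    outsym = "sampa",  # assumed value: SAMPA output symbols
    ...,
    style  = "HTTPPOST"
  )
  if (!isTRUE(extract_tag(response, "success") == "true")) {
    stop("g2p request failed: ", extract_tag(response, "output"))
  }
  # Assumed behaviour: the response carries a link to the result file,
  # not the result itself, so a second request fetches it.
  RCurl::getURL(extract_tag(response, "downloadLink"), .encoding = "UTF-8")
}

# Example (needs network access and the RCurl package):
# sampa <- run_g2p("corpus.txt", "kpv-sampa.txt")
# cat(sampa)
```

For a very large corpus it is probably safer to split the text into several files and call `run_g2p()` on each one, since the service may limit upload size.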

Hope this helps

Greetings from Munich

nikopartanen commented 4 years ago

Thanks a lot! This helped very much, and everything works as I expected.