m0ngr31 / kanzi

Alexa skill for controlling Kodi
https://lexigr.am
MIT License

Accept unicode in slots. #165

Closed jingai closed 7 years ago

jingai commented 7 years ago

This, in conjunction with m0ngr31/kodi-voice/pull/3, allows unicode in the slots that get passed to the skill.

These changes should address #113.

jingai commented 7 years ago

@mcl22 and @ausweider, I will need these translated:

nothing_playing: "There is nothing currently playing"

remaining_close: "It is nearly over"

remaining_mins: "There are {{ minutes }} minutes remaining"

remaining_min: "There is one minute remaining"

remaining_time: ", and it will end at {{ end_time }}"

jingai commented 7 years ago

@mcl22 and @ausweider, would the following statement make sense in German?

Du hast dopamin von böhse onkelz, ..., und mehr

jingai commented 7 years ago

@mcl22 and @ausweider, and these:

unknown_playing: "The currently playing item is unknown"

current_show_is: "The currently playing tv show is"

current_song_is: "The currently playing song is"

on_the_album: "on the album"

current_movie_is: "The currently playing movie is"

jingai commented 7 years ago

I believe this is all ready to go now, besides:

I'll hold off for a little while for the German translations of the new strings.

If anyone is capable of doing so, some testing from someone other than me would be nice as well. It requires Kodi-Voice changes from https://github.com/m0ngr31/kodi-voice/pull/3 to work.

mcl22 commented 7 years ago

Here are the translations:

nothing_playing: "There is nothing currently playing" "Im Augenblick findet keine Wiedergabe statt"

remaining_close: "It is nearly over" "Die Wiedergabe ist fast vorbei"

remaining_mins: "There are {{ minutes }} minutes remaining" "Es sind noch {{ minutes }} Minuten übrig"

remaining_min: "There is one minute remaining" "Es ist noch eine Minute übrig"

remaining_time: ", and it will end at {{ end_time }}" "und die Wiedergabe wird um {{ end_time }} enden"

unknown_playing: "The currently playing item is unknown" "Der Titel der aktuellen Wiedergabe ist unbekannt"

current_show_is: "The currently playing tv show is" "Der Name der gerade laufenden Serie ist"

current_song_is: "The currently playing song is" "Der Titel des aktuell laufenden Lieds lautet"

on_the_album: "on the album" "auf dem Album"

current_movie_is: "The currently playing movie is" "Der Titel des gerade laufenden Films lautet"

"Du hast dopamin von böhse onkelz, ..., und mehr" means "you have dopamin by böhse onkelz ... and more" and it makes sense. Although I would tweak it a little bit to: "Du hast {{ album }} von {{ artist }}, {{ album }} von {{ artist }}, ... und weitere"

jingai commented 7 years ago

"Du hast dopamin von böhse onkelz, ..., und mehr" means "you have dopamin by böhse onkelz ... and more" and it makes sense. Although I would tweak it a little bit to: "Du hast {{ album }} von {{ artist }}, {{ album }} von {{ artist }}, ... und weitere"

You mean specifically "weitere" vs "mehr", yes?

mcl22 commented 7 years ago

Yes I do.

jingai commented 7 years ago

Thanks!

I'm genuinely curious now though.. it's totally off-topic, but what makes "weitere" more appropriate in this context than "mehr?"

mcl22 commented 7 years ago

Puh, good question :) In the first place to me it sounds better. I'll try to explain it the best I can. More is very general. I could say I do have the album xyz by abc and more (mehr) ... music / albums /cds.... If I say I do have the album xyz by abc and weitere this weitere refers directly to albums. It's perhaps a bit like "I do have this album and others" which more means other albums not other artists / music or something else. I don't know if I could make it clear but at least I tried :)

jingai commented 7 years ago

Oh, hmm, we actually need a generic "more" in this case, as we use this string for all media types.

With that in mind, should I change it back?

jingai commented 7 years ago

Also, I'd really like to get a firm understanding of your other two issues before I merge these changes in.

For your issue with things like "boney m." and requests containing the character "ß", could you provide:

mcl22 commented 7 years ago

But it's always the answer to a question for a particular category, isn't it? Like what new movie / tv show / albums do I have. What I mean is it would work for things like:
"Du hast {{ movie }}, {{ movie }}, ... und weitere"
"Du hast {{ tv show }}, {{ tv show }}, ... und weitere"
"Du hast {{ artist }}, {{ artist }}, ... und weitere"
"Du hast {{ album }}, {{ album }}, ... und weitere"
"Du hast {{ song }}, {{ song }}, ... und weitere"

It wouldn't work for something like (what new items do I have): "Du hast Musik von {{ artist }}, den Film {{ movie }}, ... und mehr"

jingai commented 7 years ago

Ohh.. so it's a contextual thing. It currently is used only in cases where a single media type is referenced, yes. But, in the future another programmer (or even me, if I forget all this), could use it for mixed media types. The template string currently is labeled, "and_more", and it's generally good practice to re-use these kinds of things wherever we can.

I suppose the question is: is "mehr" that off-putting to warrant making two separate strings that, in other languages (like English), fundamentally mean the same thing? This eventually becomes a point of confusion for future programmers who will work on the project. Or me.. in a few months ;)

mcl22 commented 7 years ago

I see. Well, "mehr" indeed doesn't sound so great in this context, but I guess it's good enough to work. It's not completely wrong or something, and one could say it. It's probably just one of the reasons why it's always said that German is a very complicated language :)

jingai commented 7 years ago

No, it's OK. I just wanted to know if it was worth adding another string, and it sounds like it is. I've just renamed it to "and_more_similar" to differentiate it :)
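To make the distinction concrete, here is a minimal sketch of keeping the two strings apart per language. The table layout and the helper names `render` and `list_summary` are hypothetical and are not taken from the skill's code; only the German wording ("und mehr" / "und weitere") and the `{{ name }}` placeholder style come from this thread.

```python
import re

# Hypothetical per-language string tables; only the German wording
# ("und mehr" / "und weitere") comes from the discussion above.
TEMPLATES = {
    "en": {"and_more": "and more", "and_more_similar": "and more"},
    "de": {"and_more": "und mehr", "and_more_similar": "und weitere"},
}

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(text, **values):
    """Fill {{ name }} placeholders; a stand-in for a real template engine."""
    return _PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), text)


def list_summary(lang, items, same_type=True):
    """Join items and append the 'and more' phrase that fits the context."""
    # Lists of a single media type use the more specific wording.
    key = "and_more_similar" if same_type else "and_more"
    return "{} {}".format(", ".join(items), TEMPLATES[lang][key])


print(render("Es sind noch {{ minutes }} Minuten übrig", minutes=5))
print(list_summary("de", ["Mama", "Matrix"]))
print(list_summary("de", ["Mama", "Boney M."], same_type=False))
```

In English both keys resolve to the same text, so the split only matters for languages that draw the distinction.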

jingai commented 7 years ago

Post or PM me those logs and transcriptions and I can button this up :)

mcl22 commented 7 years ago

Here is the result from me asking Alexa to play "weiß". History of the Alexa app: "alexa öffne kodi und spiel weiß"

Log:
[2017-05-21 11:06:27,407] ERROR in app: Exception on / [POST]
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Python27\lib\site-packages\flask_ask\core.py", line 594, in _flask_view_func
    result = self._map_intent_to_view_func(self.request.intent)()
  File "C:/***/htdocs/kodi-alexa\alexa.py", line 558, in alexa_play_media
    heard_search = str(Movie).lower().translate(None, string.punctuation)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdf' in position 3: ordinal not in range(128)
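The failing line wraps the slot value in `str()`, which in Python 2 forces an implicit ASCII encode of a unicode value. The sketch below is written in Python 3 syntax, where `str` is already Unicode, so it reproduces that hidden step explicitly and then shows one Unicode-safe way to strip punctuation. The helper names `ascii_roundtrip` and `normalize_heard` are hypothetical and are not the code on this branch.

```python
import string

# Map every ASCII punctuation character to None so translate() deletes it.
_STRIP_PUNCT = {ord(ch): None for ch in string.punctuation}


def ascii_roundtrip(text):
    """Mimic Python 2's str(unicode_value): an implicit ASCII encode."""
    return text.encode("ascii")


def normalize_heard(text):
    """Lowercase and drop ASCII punctuation without leaving Unicode."""
    return text.lower().translate(_STRIP_PUNCT)


try:
    ascii_roundtrip("weiß")
except UnicodeEncodeError as err:
    # Same class of error as in the traceback above.
    print(err)

print(normalize_heard("Boney M."))
print(normalize_heard("Weiß"))
```

Keeping the value as Unicode from the slot all the way to the comparison avoids the encode step entirely, which is the general idea behind accepting unicode in slots.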

mcl22 commented 7 years ago

And for the Boney M. / generic slot problem: If Boney M. is not included in the samples for the custom slot MUSICARTISTS (and Bonnie Tyler is), the Alexa app claims I would have said "alexa öffne kodi und spiel musik von bonnie tyler".

If I add Boney M. to the MUSICARTISTS and ask the same, the app tells me "alexa öffne media und spiel musik von boney m." and it works.

mcl22 commented 7 years ago

One more thing, if I get you right you say that a slot gets generic if I do have around 200 - 300 samples in it? For the slots I do have problems with (movies, artists, songs, albums) I do have those numbers. I temporarily solved my artist problem by including all 381 artists I currently have. But of course this would again become a problem when I add a new one. And I can't do that for albums or songs because it would be thousands of entries and even though I learned from the documentation that Amazon would allow up to 50k entries it wouldn't save in my case. And there we are again at the problem with the utterances / amount of slot entries and all those Amazon secrets behind :)

jingai commented 7 years ago

Here is the result from me asking Alexa to play "weiß".

This should be fixed with this branch.

If Boney M. is not included in the samples for the custom slot MUSICARTISTS

Aye, it works like this because you've weighted it in favor of a similarly-sounding name. I know you might think it doesn't sound similar, but it can to a computer :)

I still need a copy/paste of the transcription of what Alexa heard, though. I don't have Boney M. in my slot, but I do have Bonnie Tyler there, and Alexa still passes "boney m." to my skill. I suspect this is down to a difference in how we pronounce "boney m," but I need to see your transcription to verify that.

I think there is some confusion about what I mean by the transcription, so I'll try to be more clear: at alexa.amazon.com or using the Alexa app on your phone, you need to go to Settings and at the bottom of the list there, there's a button that says History. In there, you'll see a listing of everything you've said to Alexa, verbatim. There will be nothing else in this section. If you click on an item, it will play back the audio snippet of you talking to her.

One more thing, if I get you right you say that a slot gets generic if I do have around 200 - 300 samples in it? For the slots I do have problems with (movies, artists, songs, albums) I do have those numbers. I temporarily solved my artist problem by including all 381 artists I currently have. But of course this would again become a problem when I add a new one.

I don't know what to tell you differently than what I've already said, unfortunately. Yes, it's going to be better if you 'help' her out by telling her exactly what you have up-front. No, it won't necessarily be a problem when you add a new artist -- just because the fuzzy matching isn't 'perfect' doesn't mean it's 'bad'.

If you find she has trouble recognizing something, you add it to the appropriate slot. In general though, you don't need to do this if she's hearing what you mean correctly.

And I can't do that for albums or songs because it would be thousands of entries and even though I learned from the documentation that Amazon would allow up to 50k entries it wouldn't save in my case. And there we are again at the problem with the utterances / amount of slot entries and all those Amazon secrets behind :)

We are keenly aware of this unfortunate fact. But as I said on the forums, until we have a context-aware matcher (a list of potential alternate titles for media, like "tupac", "2pac", etc), the skill is trying to match against a lot of items blindly, and there's not a lot we can do about it.

I am hoping Amazon's built-in library intents help us here, but unfortunately they are marked "developer preview" which means they're only available in the US at the moment.
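As a rough illustration of the difference between blind fuzzy matching and a matcher that knows alternate titles, here is a small sketch using the standard library's `difflib`. The `ALIASES` table, the `match_heard` helper, and the 0.6 cutoff are all assumptions made for the example; they do not describe how the skill or Kodi-Voice actually matches.

```python
import difflib

# Hypothetical alias table: alternate spellings mapped to a library title.
ALIASES = {
    "tupac": "2Pac",
    "2pac": "2Pac",
}


def match_heard(heard, library, cutoff=0.6):
    """Return the best library title for what was heard, or None."""
    heard = heard.lower().strip()
    # Known alternate titles win before any fuzzy comparison happens.
    if heard in ALIASES:
        return ALIASES[heard]
    # Otherwise compare blindly against every lowercased library title.
    by_lower = {title.lower(): title for title in library}
    close = difflib.get_close_matches(heard, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[close[0]] if close else None


artists = ["Boney M.", "Bonnie Tyler", "2Pac"]
print(match_heard("boney m", artists))
print(match_heard("tupac", artists))
```

Without the alias table, "tupac" would have to clear the similarity cutoff against "2pac" on spelling alone, which is the kind of blind matching described above.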

mcl22 commented 7 years ago

I think I was in the right place for the transcription. The place in settings where I could listen to the audio of what I said and where I could delete it. But it always tells exactly what also is written on the cards.

I also do understand that for a computer it could sound somehow similar but here is another example of me trying to play a movie that I tried just a minute ago.

I said "Alexa, öffne Kodi und zeige Film Mama" Alexa did understand "alexa öffne kodi und zeige film mr holmes"

I tried it several times:
Alexa did understand "alexa öffne kodi und zeige film matrix"
Alexa did understand "alexa öffne kodi und zeige film noah"
Alexa did understand "alexa öffne kodi und zeige film man of steel"
Alexa did understand "alexa öffne kodi und zeige film nerve"

And in one case she logged: "alexa öffne kodi zeige film mama" and did nothing. I mean she did do this short sound like canceling something and that's it.

I mean mama and matrix ... hm ok although the x should make a great difference. But mama and man of steel or nerve? And believe me the sound of nerve and mama is at least as different in german as it is in english :) And for man of steel or mr holmes not even the amount of syllables is the same. Of course those movies she played are all in the samples :) I don't want to bother you. At the moment to me it just seems that she first compares against the samples, tries to match and then writes the log. That's also because I know those logs when she really didn't understand the right thing. Not when trying to use the kodi skill but in other cases. And there she writes partly funny stuff with a lot of grammar mistakes and so. But here she always writes it really clear but with the one mistake that it's the wrong movie :)

jingai commented 7 years ago

And for the Boney M. / generic slot problem: If Boney M. is not included in the samples for the custom slot MUSICARTISTS (and Bonnie Tyler is), the Alexa app claims I would have said "alexa öffne kodi und spiel musik von bonnie tyler".

Assuming you're looking in the right place.. that is.. odd. I don't think the slot items should have any impact on the initial transcribing process. It doesn't appear to for me.

But, I notice you also request it a bit differently than I do, and I wonder if that's got something to do with it. Instead of "Alexa, open Kodi and", could you try it with "Alexa, ask Kodi to" ?

jingai commented 7 years ago

I mean mama and matrix ... hm ok although the x should make a great difference. But mama and man of steel or nerve? And believe me the sound of nerve and mama is at least as different in german as it is in english :) And for man of steel or mr holmes not even the amount of syllables is the same. Of course those movies she played are all in the samples :) I don't want to bother you. At the moment to me it just seems that she first compares against the samples, tries to match and then writes the log.

It's fine to bother me. This isn't how it's supposed to work. Something really weird is going on for her to match "mama" to some of those other things. The skill is designed around the expectation that she won't just match anything at all, but will instead push through what she heard verbatim to the skill. But you're saying she's hearing it like "man of steel" which is... weird.

I just tried "Alexa, ask Kodi to play the movie Mama" here and she heard:

alexa ask kodi to play the movie mama

She hears me correctly if I say, "Alexa, open Kodi and play the movie Mama" too. But maybe you can try with the equivalent to ask/tell and see if it changes anything?

jingai commented 7 years ago

...it really does sound like your slots are not getting converted to "generic" status, even though you have enough items in them. I wonder if there's something wrong with the German version of Alexa currently..?

mcl22 commented 7 years ago

I did but it's not that easy in german because in this case "to ask" would be translated with "bitte" and that is as far as I know none of the words allowed like "öffne / starte / frage / ...". But I tried it by adding a new utterance, too ("ListenToArtist {Artist} zu spielen") because almost all german utterances are built for sentences like "open/start kodi and ....". However this change didn't do it. Once she again played Bonnie Tyler and in six more tries she "canceled" and did nothing but log "alexa frage kodi boney m. zu spielen". Six tries and six times the exact same log and behavior. That's why I think that she could and would understand me right actually.

jingai commented 7 years ago

I'm really not sure what to do then. German Alexa definitely doesn't seem to be doing what we expect, and it definitely isn't doing this on my copies of the skill here :(

@ausweider and @sveni-lee do you see this too?

mcl22 commented 7 years ago

I found some new unicode errors :) I don't really get it but when testing the utterances I tried the WatchEpisode intent. It works for almost all tv shows I have. But there seems to be a weird problem with the show "4400 - Die Rückkehrer".

If I ask Alexa to play it by saying the number four thousand four hundred die rückkehrer she logs (in the app history) "Konnte die Serie vier tausend vier hundert die ruckkehrer nicht finden" which means she couldn't find it. Then I tried to say four four zero zero die rückkehrer and she failed. The server log says:
[2017-05-21 20:33:29,128] ERROR in app: Exception on / [POST]
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Python27\lib\site-packages\flask_ask\core.py", line 594, in _flask_view_func
    result = self._map_intent_to_view_func(self.request.intent)()
  File "C:/***/htdocs/kodi-alexa\alexa.py", line 1845, in alexa_watch_episode
    heard_show = str(Show).lower().translate(None, string.punctuation)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 25: ordinal not in range(128)

And for WatchLatestEpisode I get another error. If I tell Alexa to play eg season 1 episode 1 from the x-files it works. The german title of the show is "Akte X - Die unheimlichen Fälle des FBI". So I tell her to play "Staffel 1 Episode 1 von Akte X die unheimlichen Fälle des FBI". It's all ok. But if I try it with the latest episode and say "Zeige letzte Episode von Akte X - die unheimlichen Fälle des FBI" it fails with the following error:
[2017-05-21 20:43:02,984] ERROR in app: Exception on / [POST]
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Python27\lib\site-packages\flask_ask\core.py", line 594, in _flask_view_func
    result = self._map_intent_to_view_func(self.request.intent)()
  File "C:/***/htdocs/kodi-alexa\alexa.py", line 1878, in alexa_watch_next_episode
    heard_show = str(Show).lower().translate(None, string.punctuation)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 25: ordinal not in range(128)

And one last thing btw. I don't know if it has something to do with all those problems I run into, but the german utterances seem to be at least a bit difficult. Well, some of them. Eg in a lot of cases when Alexa doesn't do the right thing she just starts party play. Even if the action she didn't understand right was eg to play a movie - at least nothing containing party or something. Also the WhatAlbums often ends up in playing an album and so on. Well, I don't know. Currently every time I try to test something I seem to run into another thing :) So whatever :)

jingai commented 7 years ago

The unicode errors should all be fixed on this branch. I will merge all of this in tomorrow and you can tell me if that is in fact the case.

As for the utterances.. yeah.. getting them right is a bit of a balancing act. I've put a lot of thought into avoiding conflicts while still allowing a wide variety of ways to say things in the English utterances. Unfortunately, it's very difficult for me to give the German utterances the same treatment since I don't speak German..

jingai commented 7 years ago

Ready for merge. Needs Kodi-Voice 0.8.0 in PyPi before it will work.

mcl22 commented 7 years ago

So far this looks pretty good for the umlauts. But I found one thing :) When generating the new samples some files now contained umlauts (as intended). But for the MOVIEGENRES there was a Ã¶ instead of the right umlaut. This is for the genre "Komödie" (comedy). So I corrected it since it was the only item I found that was wrong. But when trying to ask Alexa for comedy movies I again get a unicode error in the logs:
Neu hinzugef\xc3\xbcgte Filme
Sending request to http://127.0.0.1:8081/jsonrpc from device amzn1.ask.device.AH*
[2017-05-22 18:46:39,976] ERROR in app: Exception on / [POST]
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\flask\app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Python27\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Python27\lib\site-packages\flask\app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Python27\lib\site-packages\flask_ask\core.py", line 594, in _flask_view_func
    result = self._map_intent_to_view_func(self.request.intent)()
  File "C:/*/htdocs/kodi-alexa\alexa.py", line 2074, in alexa_what_new_movies
    response_text = render_template('you_have_list', items=movie_list).encode("utf-8").encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 35: ordinal not in range(128)

The utterance was: "WhatNewMovies ob es neue {Genre} Filme gibt" Alexa app: "alexa frage kodi ob es neue komödie filme gibt"

So I thought it was because of the umlaut, but then I tried horror movies. This time she didn't understand me right. She understood "alexa frage kodi was neue horror filme gibt". Though it's not so far from the truth it ended up in an error. A unicode error again, although there is no umlaut in "horror". The log says: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 113: ordinal not in range(128) When she understands me correctly I get the answer I would expect.

But that's not all :) I tried action movies. And this time she understood me right: "alexa frage kodi ob es neue action filme gibt". But again I get an unicode error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 29: ordinal not in range(128)

For comic movies I get: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 84: ordinal not in range(128)

For abenteuer (adventure): UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 74: ordinal not in range(128)

Musical movies: UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 8: ordinal not in range(128)

But while writing this I've just noticed something. If I take this utterance "WhatNewMovies ob du neue {Genre} Filme hast" instead of the one above I also get a unicode error for horror. So it also has something to do with the utterances (again)?! And the genre actually is being ignored. Unless you would say "La La Land" is horror :)

And even if I take "WhatNewMovies welcher Film wartet" without {genre} I get an unicode error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 49: ordinal not in range(128)
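The `UnicodeDecodeError` messages above are what Python 2 raises when a byte string that already holds UTF-8 is encoded again: it first decodes the bytes with the default `ascii` codec and trips on the first non-ASCII byte. The sketch below uses Python 3 syntax and makes that hidden decode explicit; `python2_style_reencode` is a hypothetical helper written for illustration, and the sample text is the one visible in the log line.

```python
# "Neu hinzugefügte Filme" as UTF-8 bytes, matching the escaped log line above.
encoded = "Neu hinzugefügte Filme".encode("utf-8")


def python2_style_reencode(data):
    """Mimic Python 2's byte-string .encode('utf-8'): ASCII-decode, then encode."""
    return data.decode("ascii").encode("utf-8")


try:
    python2_style_reencode(encoded)
except UnicodeDecodeError as err:
    # Same class of error as in the log: the ascii codec rejects byte 0xc3.
    print(err)

# Decoding with the codec the bytes were actually written in succeeds.
print(encoded.decode("utf-8"))
```

This would also explain why requests without any umlaut in the spoken text can still fail: the error depends on the bytes in the rendered response, not on what was said.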

jingai commented 7 years ago

So far this looks pretty good for the umlauts. But I found one thing :) When generating the new samples some files now contained umlauts (as intended). But for the MOVIEGENRES there was a Ã¶ instead of the right umlaut. This is for the genre "Komödie" (comedy).

Did you use the python generator (generate_custom_slots.py) or the web-based generator?

For the WhatNewMovies issue.. try replacing alexa.py with this one and redeploy.

Also, does WhatAlbums work? And WhatNewAlbums?

mcl22 commented 7 years ago

Btw the german translations work great! There's only one little thing I discovered :) If asking for currently playing tv show Alexa says something like " ... season X episode Y title". The season is called "Staffel" in german. Episode is ok because it's the same in german. But perhaps you can change season to Staffel.

mcl22 commented 7 years ago

I used generate_custom_slots.py.

mcl22 commented 7 years ago

WhatAlbums does work, WhatNewAlbums doesn't. There she tells me about new episodes of tvshows or what's playing currently.

jingai commented 7 years ago

How did you ask for WhatAlbums and WhatNewAlbums?

mcl22 commented 7 years ago

With the new alexa.py I don't get any unicode errors so far. And also the {genre} does work, but only if I really speak more than clearly and word by word. Don't know if you know what I mean but it's like syllable for syllable :) And if Alexa gets it right she logs in the app "alexa frage kodi ob wir einen neuen musical film haben" even though I used "ob wir neue {Genre} Filme haben".

I'm not complaining. I only once again have the feeling that in the history she writes something after doing her own matching. There is also an utterance "ob wir einen neuen {Genre} Film haben".

Actually that would all be ok if she didn't ignore the {genre} if there's only one or two wrong characters. This isn't like she normally behaves. Eg "alexa frage kodi ob wir neuen musical film haben". I think the utterances in this scenario are too similar and she takes "WhatNewMovies ob wir neue Filme haben" instead. Or something similar. Does it actually make sense to have words like "ein/e" in the utterances? This means "a" and I thought words like "a" and "the" could always be added without being explicitly included in an utterance? But I'm not sure about that.

jingai commented 7 years ago

Try all of these utterances please:

welche Alben du von {Artist} hast
welche welches neue Album wartet
welche neuen Filme warten
welche Komödie Filme warten
welche neue Serie wartet

As for articles ("ein/e" etc), I've found that they are only necessary if that utterance also includes a slot reference. I don't really know why, but that's been my experience.

mcl22 commented 7 years ago

I used "WhatAlbums welche Alben ich von {Artist} habe" and for WhatNewAlbums I tried:
WhatNewAlbums welche Alben warten
WhatNewAlbums welche CD fertig ist
WhatNewAlbums welche CDs warten
WhatNewAlbums welches Album fertig ist
WhatNewAlbums welches neue Album wartet

mcl22 commented 7 years ago

welche Alben du von {Artist} hast ... works
welche Alben auf dem mediaserver sind ... ends up in telling me about new tv show episodes
welche neuen Filme warten ... works
welche Komödie Filme warten ... tells recently added movies not comedy movies
welche neue Serie wartet ... works

jingai commented 7 years ago

welche Alben auf dem mediaserver sind ... ends up in telling me about new tv show episodes
welche Komödie Filme warten ... tells recently added movies not comedy movies

Try alternates until you find something that executes the right handler. I'm only trying to clean up the unicode errors at the moment -- #163 is more relevant for the issue of her misunderstanding what to do.

jingai commented 7 years ago

There's only one little thing I discovered :) If asking for currently playing tv show Alexa says something like " ... season X episode Y title". The season is called "Staffel" in german. Episode is ok because it's the same in german. But perhaps you can change season to Staffel.

I simply missed these two. I've made "season" == "Staffel" and "episode" == "Folge". Is that OK?

jingai commented 7 years ago

So far this looks pretty good for the umlauts. But I found one thing :) When generating the new samples some files now contained umlauts (as intended). But for the MOVIEGENRES there was a Ã¶ instead of the right umlaut. This is for the genre "Komödie" (comedy).

Did you use the python generator (generate_custom_slots.py) or the web-based generator?

I used generate_custom_slots.py.

Could you give the web-based slot generator a try and see if it has the same problem please?

mcl22 commented 7 years ago

Folge is ok, Episode is, too. Your choice :) I'll try the web based slot generator tomorrow as I'm trying to build some own utterances using makermusings.com right now. I'll tell you my results.

mcl22 commented 7 years ago

Anyway, I've tried the web based generator and it works for Komödie and I didn't see any other error.

jingai commented 7 years ago

Merged in the fixes for the unicode errors in WhatNew* intents and the translations for 'season' and 'episode'.

I'll look at the slot generator as soon as I get a chance.

jingai commented 7 years ago

@mcl22, can you try looking at the MOVIEGENRES file (from generate_custom_slots.py) from within a web browser?

I think your problem with Komödie is due to you viewing the file with an application that is assuming Windows-1252 (CP1252) encoding rather than UTF-8.

(Ã¶ = ö in CP1252, and it took me way longer than I'd like to admit to figure this out)
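The effect is easy to reproduce. This small sketch, with a hypothetical helper `view_as`, shows how text written as UTF-8 looks when a viewer assumes Windows-1252:

```python
def view_as(text, written_as="utf-8", read_as="cp1252"):
    """Show how text looks when bytes written in one codec are read in another."""
    return text.encode(written_as).decode(read_as)


# UTF-8 bytes misread as CP1252: the two bytes of "ö" become two characters.
print(view_as("Komödie"))

# The same bytes read back as UTF-8 display correctly.
print(view_as("Komödie", read_as="utf-8"))
```

If a file shows the two-character form even in a viewer that is known to read UTF-8, the mangled text has most likely been written into the file itself rather than produced by the viewer.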

mcl22 commented 7 years ago

I don't want to disappoint you but that's not the case :) I always use PSPad as editor and it shows utf8 for the file. The other files eg MOVIES also contain umlauts and they are displayed ok. And finally looking at the file with chrome shows "Komödie".

jingai commented 7 years ago

Try this generate_custom_slots.py and let me know.

mcl22 commented 7 years ago

Nope, same result. In MOVIEGENRES the "ö" is replaced by "Ã¶". In eg MOVIES "ö" is displayed as "ö".

jingai commented 7 years ago

Wait.. what? The same character is displayed OK in MOVIES?