Closed leranp closed 9 months ago
Hey there @balloob, @synesthesiam, mind taking a look at this issue as it has been labeled with an integration (assist_pipeline) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
assist_pipeline documentation assist_pipeline source (message by IssueLinks)
This happens because the available languages are those supported by the conversational model (I think that's the intents repository), by the speech-to-text service (if one exists) and by the text-to-speech service (the default configuration uses Google Translate). So, it takes the languages from all of these services, intersects them with each other, and whatever's left is shown in the drop-down list.
The problem is that the conversational model uses he for Hebrew while the Google TTS uses iw for Hebrew, so neither code makes the cut at the end.
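To make the failure concrete, here is a minimal sketch of a plain set intersection dropping Hebrew. The language sets are made up for illustration; the real ones come from whichever services are configured.

```python
# Illustrative language sets only; the real lists come from each service.
conversation_languages = {"en", "de", "he"}  # conversation agent reports "he"
tts_languages = {"en", "de", "iw"}           # Google Translate TTS reports "iw"

# A plain set intersection compares the codes as strings, so "he" and "iw"
# never match even though both mean Hebrew.
pipeline_languages = conversation_languages & tts_languages

print(sorted(pipeline_languages))  # ['de', 'en']
```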
I have two suggestions for fixing this, not sure what the best approach would be. The first is to "normalize" the Hebrew language code:
diff --git a/homeassistant/util/language.py b/homeassistant/util/language.py
index 4ec8c74ffa..9882210dc9 100644
--- a/homeassistant/util/language.py
+++ b/homeassistant/util/language.py
@@ -87,6 +87,12 @@ class Dialect:
         # Languages are lower-cased
         self.language = self.language.casefold()
+        # Normalize language name
+        for language_names in SAME_LANGUAGES:
+            if self.language in language_names:
+                self.language = language_names[0]
+                break
+
         if self.region is not None:
             # Regions are upper-cased
             self.region = self.region.upper()
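As a standalone illustration of the idea (not the actual Dialect class), the normalization can be sketched like this. The contents of SAME_LANGUAGES and the helper name normalize_language are assumptions made for the example.

```python
# Groups of codes that denote the same language; the first entry is treated
# as canonical. The exact contents are an assumption for this sketch.
SAME_LANGUAGES = (("nb", "no"), ("he", "iw"))


def normalize_language(code: str) -> str:
    """Hypothetical helper: map a language code to its canonical variant."""
    code = code.casefold()
    for language_names in SAME_LANGUAGES:
        if code in language_names:
            return language_names[0]
    return code


# Once both sides are normalized, a plain set intersection keeps Hebrew.
conversation = {normalize_language(c) for c in ("en", "he")}
tts = {normalize_language(c) for c in ("en", "iw")}

print(sorted(conversation & tts))  # ['en', 'he']
```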
And the second is a bit more generic by using existing methods to compare if two languages are the same and using that for the intersection:
diff --git a/homeassistant/components/assist_pipeline/websocket_api.py b/homeassistant/components/assist_pipeline/websocket_api.py
index bd2ec53db4..7d68e2910e 100644
--- a/homeassistant/components/assist_pipeline/websocket_api.py
+++ b/homeassistant/components/assist_pipeline/websocket_api.py
@@ -314,7 +314,7 @@ async def websocket_list_languages(
             dialect = language_util.Dialect.parse(language_tag)
             languages.add(dialect.language)
         if pipeline_languages is not None:
-            pipeline_languages &= languages
+            pipeline_languages = language_util.intersect(pipeline_languages, languages)
         else:
             pipeline_languages = languages
@@ -324,7 +324,7 @@ async def websocket_list_languages(
             dialect = language_util.Dialect.parse(language_tag)
             languages.add(dialect.language)
         if pipeline_languages is not None:
-            pipeline_languages &= languages
+            pipeline_languages = language_util.intersect(pipeline_languages, languages)
         else:
             pipeline_languages = languages
diff --git a/homeassistant/util/language.py b/homeassistant/util/language.py
index 4ec8c74ffa..9882210dc9 100644
--- a/homeassistant/util/language.py
+++ b/homeassistant/util/language.py
@@ -199,3 +205,14 @@ def matches(
     # Score < 0 is not a match
     return [tag for _dialect, score, tag in scored if score[0] >= 0]
+
+def intersect(
+    set1: Iterable[str], set2: Iterable[str]
+) -> set[str]:
+    """Return the intersection of two language sets taking into consideration name variations."""
+    languages = set()
+    for language in set1:
+        matching_languages = matches(language, set2)
+        if len(matching_languages) > 0:
+            languages.add(matching_languages[0])
+    return languages
The latter, though more generic, might return either he or iw for Hebrew, depending on the order of the intersection, and I don't know how that might affect things later on in the pipeline.
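The order dependence can be seen in a small self-contained sketch. Here same_language is a hypothetical stand-in for the real matching logic in homeassistant/util/language.py, and the SAME_LANGUAGES contents are assumed; only the shape of intersect follows the diff above.

```python
from collections.abc import Iterable

# Assumed alias groups, for illustration only.
SAME_LANGUAGES = (("nb", "no"), ("he", "iw"))


def same_language(a: str, b: str) -> bool:
    """Hypothetical stand-in for the real language matching."""
    a, b = a.casefold(), b.casefold()
    return a == b or any(a in names and b in names for names in SAME_LANGUAGES)


def intersect(set1: Iterable[str], set2: Iterable[str]) -> set[str]:
    """Intersect two language sets, treating alias codes as equal."""
    candidates = list(set2)
    languages: set[str] = set()
    for language in set1:
        matching = [c for c in candidates if same_language(language, c)]
        if matching:
            # The code that is kept comes from the second argument.
            languages.add(matching[0])
    return languages


print(intersect({"he"}, {"iw"}))  # {'iw'}
print(intersect({"iw"}, {"he"}))  # {'he'}
```

In this sketch the surviving code is always the one from the second argument, so the result would depend on which service's languages are intersected last.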
@synesthesiam or @emontnemery, as most of the related code here is yours, I'd be happy to get your feedback before opening a pull request.
Thanks!
This PR is going to handle languages that have two codes: https://github.com/home-assistant/core/pull/93681, but I am not sure whether it will fix the Hebrew selection.
@leranp, unfortunately that doesn't help with this specific issue, but you'll notice that I am relying on it to correlate between the two versions.
Is there any workaround for this issue until it's properly fixed?
@tidharmor You can try to manually apply one of the above suggestions. I haven't opened a pull request for it since I don't know which is the preferred method. I suggest starting with the first one since it's less likely to cause other issues.
@shmuelzon I'm a developer, but haven't played around with Home Assistant development yet. I'm running HAOS. Is it possible to apply this fix in that environment, or do I have to set up a development environment?
Thanks
@tidharmor I've never set up a development environment either :)
I don't use HAOS, I use the Docker installation, and I just run /bin/bash in the HA Docker instance, modify the files and restart HA.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
The problem
When choosing the language from the list, Hebrew is not available.
What version of Home Assistant Core has the issue?
core-2023.5.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Container
Integration causing the issue
assist_pipeline
Link to integration documentation on our website
https://www.home-assistant.io/integrations/assist_pipeline/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response