Open gacekk opened 1 month ago
Hi,
Can you give more context on the outcomes?
Some of the homes generated in the dataset use other languages, e.g. https://github.com/allenporter/home-assistant-datasets/blob/main/datasets/devices/casa-del-sol-es.yaml
There are two things we can do: (1) Use the crowdsourced Home Assistant intents data, e.g. https://github.com/home-assistant/intents/blob/main/tests/pl/climate_HassClimateGetTemperature.yaml, and convert it to the format used here: https://github.com/allenporter/home-assistant-datasets/tree/main/datasets/intents (2) Modify the generation notebooks to ask for sentences in specific languages: https://github.com/allenporter/home-assistant-datasets/blob/main/generation/device-actions.ipynb
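For option (1), a rough sketch of the conversion step might look like the following. It assumes the intents test files parse (e.g. via `yaml.safe_load`) into a mapping with a `language` key and a `tests` list whose entries carry `sentences` and an `intent` with `name` and `slots`. The output record layout and the function name `flatten_intent_tests` are hypothetical, since the exact target schema under `datasets/intents` is not spelled out in this thread, and the sample sentences are made up for illustration.

```python
from typing import Any


def flatten_intent_tests(doc: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten one parsed intents test file into per-sentence records.

    Assumed input layout: {"language": ..., "tests": [{"sentences": [...],
    "intent": {"name": ..., "slots": {...}}}, ...]}.
    The output fields are a hypothetical schema, not the repo's actual one.
    """
    language = doc.get("language")
    records: list[dict[str, Any]] = []
    for test in doc.get("tests", []):
        intent = test.get("intent") or {}
        for sentence in test.get("sentences", []):
            records.append(
                {
                    "language": language,
                    "sentence": sentence,
                    "intent": intent.get("name"),
                    # Copy so records do not share one slots dict.
                    "slots": dict(intent.get("slots") or {}),
                }
            )
    return records


# Made-up sample standing in for the result of yaml.safe_load(file).
sample = {
    "language": "pl",
    "tests": [
        {
            "sentences": ["jaka jest temperatura"],
            "intent": {"name": "HassClimateGetTemperature", "slots": {}},
        },
        {
            "sentences": ["jaka jest temperatura w salonie"],
            "intent": {
                "name": "HassClimateGetTemperature",
                "slots": {"area": "Salon"},
            },
        },
    ],
}

records = flatten_intent_tests(sample)
for record in records:
    print(record)
```

The flat records would still need to be mapped onto whatever fields the dataset files here actually expect.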
What is the specific use case you're working on? Which dataset are you looking for?
Hi
I understand that this is all based on data in English. How about other languages?
Most LLMs are trained predominantly on English data and have limited support for other languages.