hearchco / agent

Agent for Hearchco built using Go.
https://hearch.co
GNU Affero General Public License v3.0

Google & Startpage english only results #326

Closed GGLVXD closed 2 weeks ago

GGLVXD commented 1 month ago

The server can be located outside an English-speaking country, which means the results will not be in English.

aleksasiriski commented 1 month ago

This should somehow be incorporated into the locale option. I'm currently working on a refactor of how engines return the results to the router, so if this is a bug you've encountered, it should be fixed after the refactor.

I will let you know when the refactor is done so we can further discuss how to handle this. The priority is to make everything configurable, so English results shouldn't be hardcoded in the URL of any engine.

aleksasiriski commented 3 weeks ago

The refactor is done, so we can now discuss this bug further. Google already has the mentioned params set correctly, but for Startpage we need to find the correct ones for locale.

aleksasiriski commented 3 weeks ago

As far as I can see for Startpage, we need to create a map of our locales to theirs: [image]

After that, we can add the locale param "language" for Startpage.

Currently Hearchco supports any locale in a format like "en_US", so to create the map for Startpage you would have to map the first two chars of Locale to each of Startpage's supported languages. If the wanted locale isn't supported, it should fall back to the default locale in Hearchco (en_US, which gets mapped to Startpage's English).

For the fallback code logic, you can take a look at Qwant.

For example, the code would look something like this (you just need to pass the first two chars of Locale to the map):

spLangs := map[string]string{
        "af": "afrikaans",
        "ar": "arabic",
        "am": "amharic",
        "az": "azerbaijani",
        "id": "indonesian",
        "ms": "malay",
        "bg": "bulgarian",
        "bn": "bengali",
        "jv": "javanese",
        "su": "sudanese",
        "bs": "bosnian",
        "be": "belarusian",
        "ca": "catalan",
        "cs": "czech",
        "cy": "welsh",
        "da": "dansk",
        "de": "deutsch",
        "et": "estonian",
        "el": "greek",
        "en": "english",
        "es": "espanol",
        "eo": "esperanto",
        "eu": "basque",
        "fa": "persian",
        "tl": "tagalog",
        "fo": "faroese",
        "fr": "francais",
        "fy": "frisian",
        "ga": "irish",
        "gd": "gaelic",
        "gl": "galician",
        "gu": "gujarati",
        "hi": "hindi",
        "hr": "croatian",
        "ia": "interlingua",
        "xh": "xhosa",
        "zu": "zulu",
        "is": "icelandic",
        "it": "italiano",
        "he": "hebrew",
        "kn": "kannada",
        "ka": "georgian",
        "sw": "swahili",
        "la": "latin",
        "lv": "latvian",
        "lt": "lithuanian",
        "hu": "hungarian",
        "bh": "bihari",
        "mk": "macedonian",
        "ml": "malayalam",
        "mt": "maltese",
        "mr": "marathi",
        "nl": "nederlands",
        "ne": "nepali",
        "no": "norsk",
        "uz": "uzbek",
        "oc": "occitan",
        "th": "thai",
        "pl": "polski",
        "pt": "portugues",
        "pa": "punjabi",
        "ro": "romanian",
        "ru": "russian",
        "sq": "albanian",
        "si": "sinhalese",
        "sk": "slovak",
        "sl": "slovenian",
        "sr": "serbian",
        "fi": "suomi",
        "sv": "svenska",
        "ta": "tamil",
        "te": "telugu",
        "vi": "vietnamese",
        "ti": "tigrinya",
        "tr": "turkce",
        "uk": "ukrainian",
        "ur": "urdu",
        "ko": "hangul",
        "zh-cn": "jiantizhongwen", // key is longer than two chars, so a lookup by "zh" will not match it
        "ja": "nihongo",
        "zh-tw": "fantizhengwen", // same caveat as "zh-cn"
}

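The lookup and fallback described above could be sketched roughly as follows. This is only a minimal illustration, not the actual Hearchco or Qwant code: the helper name `startpageLanguage`, the constant `defaultLocale`, and the shortened `spLangs` map are assumptions made for the example, and the "zh-cn"/"zh-tw" entries would need separate handling that is not shown here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// defaultLocale mirrors the Hearchco default locale mentioned in this thread.
const defaultLocale = "en_US"

// spLangs is a shortened copy of the map above, kept small for illustration.
var spLangs = map[string]string{
	"de": "deutsch",
	"en": "english",
	"fr": "francais",
	"sr": "serbian",
}

// startpageLanguage is a hypothetical helper: it maps a locale such as
// "de_DE" to a Startpage language name using the first two chars, and
// falls back to the default locale when there is no match.
func startpageLanguage(locale string) string {
	if len(locale) >= 2 {
		if lang, ok := spLangs[strings.ToLower(locale[:2])]; ok {
			return lang
		}
	}
	// Unsupported or malformed locale: use the default locale's language.
	return spLangs[strings.ToLower(defaultLocale[:2])]
}

func main() {
	for _, locale := range []string{"de_DE", "sr_RS", "xx_XX", ""} {
		// Only the "language" param name comes from the discussion above;
		// how it is attached to the real request is not shown here.
		params := url.Values{}
		params.Set("language", startpageLanguage(locale))
		fmt.Printf("%q -> %s\n", locale, params.Encode())
	}
}
```

With this approach an unknown locale such as "xx_XX" or an empty string resolves to "english", which matches the fallback behaviour described above.
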
aleksasiriski commented 6 days ago

If you want to implement proper Startpage locale support as described above, please feel free to reopen this. Don't forget to merge the latest changes (especially those from the refactor). Also, since we already support Google's locale settings, changes to Google shouldn't be included in this PR. If Google still gives you the wrong results, open an issue or a separate PR to fix that bug.