bahnzumberg / zuugle-api

Public repositority for backend components of www.zuugle.at
https://www.zuugle.at
GNU General Public License v3.0
2 stars 3 forks source link

Multi language search backend #10

Closed martinheppner closed 10 months ago

martinheppner commented 1 year ago

Union of three SQLs:

  1. menu_lang: First filters on menu_lang=tours.text_lang and multiply the search weight multiplied with 100
  2. TLD derived: filter on all languages derived from the TLD excluding the one already handled in union 1 (menu_lang). Search_weight multiplied with 10
  3. All the rest: Every language not handled above, will be used in the query now

Over this union, there has to be a query, which is doing the ORDER BY and the LIMIt & OFFSET

martinheppner commented 1 year ago

First task is to collect all use cases / situations.

Falsal commented 1 year ago

@martinheppner : collected the use cases in a google sheet, check your chat

martinheppner commented 1 year ago

SELECT o."id", o."url", o."provider", o."hashed_url", o."description", o."image_url", o."ascent", o."descent", o."difficulty", o."difficulty_orig", o."duration", o."distance", o."title", o."type", o."children", o."number_of_days", o."traverse", o."country", o."state", o."range_slug", o."range", o."season", o."month_order", o."country_at", o."country_de", o."country_it", o."country_ch", o."country_si", o."country_fr", o."publishing_date", o."quality_rating", o."user_rating_avg", o."cities", o."cities_object", o."max_ele" FROM (

SELECT i1., ts_rank(i1.search_column, websearch_to_tsquery('english', '"schneeberg" schneeberg:') ) {rank_en} as result_rank FROM tour AS i1 WHERE i1.text_lang='en' AND i1."country_at" = true and i1.cities @> '[{"city_slug": "wien"}]'::jsonb AND i1.search_column @@ websearch_to_tsquery('english', '"schneeberg" schneeberg:')

UNION

SELECT i2., ts_rank(i2.search_column, websearch_to_tsquery('german', '"schneeberg" schneeberg:') ) {rank_de} as result_rank FROM tour AS i2 WHERE i2.text_lang='de' AND i2."country_at" = true and i2.cities @> '[{"city_slug": "wien"}]'::jsonb AND i2.search_column @@ websearch_to_tsquery('german', '"schneeberg" schneeberg:')

UNION

SELECT i3., ts_rank(i3.search_column, websearch_to_tsquery('french', '"schneeberg" schneeberg:') ) {rank_fr} as result_rank FROM tour AS i3 WHERE i3.text_lang='fr' AND i3."country_at" = true and i3.cities @> '[{"city_slug": "wien"}]'::jsonb AND i3.search_column @@ websearch_to_tsquery('french', '"schneeberg" schneeberg:')

UNION

SELECT i4., ts_rank(i4.search_column, websearch_to_tsquery('italian', '"schneeberg" schneeberg:') ) {rank_it} as result_rank FROM tour AS i4 WHERE i4.text_lang='it' AND i4."country_at" = true and i4.cities @> '[{"city_slug": "wien"}]'::jsonb AND i4.search_column @@ websearch_to_tsquery('italian', '"schneeberg" schneeberg:')

UNION

SELECT i5., ts_rank(i5.search_column, websearch_to_tsquery('simple', '"schneeberg" schneeberg:') ) {rank_sl} as result_rank FROM tour AS i5 WHERE i5.text_lang='sl' AND i5."country_at" = true and i5.cities @> '[{"city_slug": "wien"}]'::jsonb AND i5.search_column @@ websearch_to_tsquery('simple', '"schneeberg" schneeberg:')

UNION

) AS o

order by o."month_order" asc, o.result_rank DESC, o.traverse DESC, FLOOR((o.cities_object->'wien'->>'best_connection_duration')::int/30)*30 ASC, o.ID % date_part('day', NOW() )::INTEGER ASC

limit 9 offset 9

Falsal commented 1 year ago

Up until now I was able to achieve the functionality to create the above query as it is in a general format, i.e. it is one function that creates all of the above query (looping over different arrays that were created for this purpose) The challenge is to integrate this function inside the listWrapp that creates the complete query (with other parameters including filter params).

martinheppner commented 1 year ago

The inner 5 SQLs are always rhere. The weighting value changes. The city and the search term changes. Everything else stays rhe same in the inner sql.

The outer sql is the one you are doing the filtering with (just like in the first version). The sorting is done in the outer sql as well.

I do not see the need of looping through an array, but maybe I am wrong.

Falsal commented 1 year ago

from your comment above : _The outer sql is the one you are doing the filtering with (just like in the first version). The sorting is done in the outer sql as well._ Do you mean that the outer sql , actually remains the same as before ? and so does the sorting that comes afterwards?

isnt this the outer sql ? SELECT o."id", o."url", o."provider", ........ ........ FROM (

it maybe some small issue that i haven't caught yet, so can we/ should we use the "outer sql" already generated by the function listWrapp ? just as a reminder , by the time it reaches the search section were we should be inserting new search related code, the query looks like this:

select "id", "url", "provider", "hashed_url", "description", "image_url", "ascent", "descent", "difficulty", "difficulty_orig", "duration", "distance", "title", "type", "children", "number_of_days", "traverse", "country", "state", "range_slug", "range", "season", "month_order", "country_at", "country_de", "country_it", "country_ch", "country_si", "country_fr", "publishing_date", "quality_rating", "user_rating_avg", "cities", "cities_object", "max_ele" from "tour" where "country_at" = true and cities @> '[{"city_slug": "wien"}]'::jsonb

then this "" AND " : ""}search_column @@ websearch_to_tsquery('german', '"${_search}" ${_search}:*')` " is added to it : so .. should we add the currently generated "outer" query as is ? modify it ?

or should we create a completely new version of it ?

this is what i would like to talk to you about tomorrow.

martinheppner commented 1 year ago

We will still have to sent two queries. In Metacode like this:

query_count = knex.raw("SELECT count(*)" + sql_from + sql_where); query_result = knex.raw(sql_select + sql_from + sql_where + sql_order);

Falsal commented 1 year ago

Code is in branch dev-drop-slovenia-issue#10