Scalingo / documentation

Scalingo Documentation Center
https://doc.scalingo.com

Improve search #1713

Open btrd opened 2 years ago

btrd commented 2 years ago

Currently the search feature works but can be improved.

Example: when searching for dbclient-fetcher, this paragraph should be the top result: https://doc.scalingo.com/platform/databases/access#manually-install-the-databases-cli-in-one-off

Currently it's not even in the result list.

CF #1660

Frzk commented 2 years ago

I don't know how the search engine works, and I don't even know how the documentation CMS works, but as a starting point I suggest adding dbclient-fetcher to the list of tags for this page.

btrd commented 2 years ago

@Frzk Actually, I think the tags aren't used; I just tested it locally. Another ticket to open 😅

aurelien-reeves-scalingo commented 1 year ago

Knowledge sharing

Anouchka-M commented 1 year ago

We thought the cause might be that the HTML code tag (markdown inline code) wasn't indexed properly, but that's not the case.

If we search for composer.json, https://doc.scalingo.com/languages/php/ips-filtering#deployment-process-configuration appears in the search results, with the following sentence:

This process requires you to edit the composer.json file of your project.

Screenshot from 2023-07-12 11-28-06

Anouchka-M commented 1 year ago

Sent a message to customer support, but they probably won't answer as it's a free product.

https://support.algolia.com/hc/en-us/requests/569425

Anouchka-M commented 1 year ago

After spending a few hours on the subject with Jean, my view is that fixing this probably requires configuration changes and a deeper knowledge of Algolia.

Configuring Algolia is not easy, so solving the issue will take time, or someone with good knowledge of Algolia and its crawler.

What we found is that the record is correctly indexed: https://dashboard.algolia.com/apps/RWJM2H1BD2/explorer/browse/scalingo-doc?query=15-https%3A%2F%2Fdoc.scalingo.com%2Fplatform%2Fdatabases%2Faccess&searchMode=objectID

{
  "version": "",
  "tags": [],
  "url": "https://doc.scalingo.com/platform/databases/access#manually-install-the-databases-cli-in-one-off",
  "url_without_variables": "https://doc.scalingo.com/platform/databases/access#manually-install-the-databases-cli-in-one-off",
  "url_without_anchor": "https://doc.scalingo.com/platform/databases/access",
  "anchor": "manually-install-the-databases-cli-in-one-off",
  "content": "You can use the script dbclient-fetcher to download and install it. Using this script is\nas simple as:",
  "content_camel": "You can use the script dbclient-fetcher to download and install it. Using this script is\nas simple as:",
  "lang": "en",
  "language": "en",
  "type": "content",
  "no_variables": false,
  "weight": {
    "pageRank": 0,
    "level": 0,
    "position": 15
  },
  "hierarchy": {
    "lvl0": "Access Your Database",
    "lvl1": "Interactive Remote Console",
    "lvl2": "Manually install the databases CLI in one-off",
    "lvl3": null,
    "lvl4": null,
    "lvl5": null,
    "lvl6": null
  },
  "recordVersion": "v2",
  "hierarchy_radio": {
    "lvl0": null,
    "lvl1": null,
    "lvl2": null,
    "lvl3": null,
    "lvl4": null,
    "lvl5": null,
    "lvl6": null
  },
  "hierarchy_camel": [
    {
      "lvl0": "Access Your Database",
      "lvl1": "Interactive Remote Console",
      "lvl2": "Manually install the databases CLI in one-off",
      "lvl3": null,
      "lvl4": null,
      "lvl5": null,
      "lvl6": null
    }
  ],
  "hierarchy_radio_camel": {
    "lvl0": null,
    "lvl1": null,
    "lvl2": null,
    "lvl3": null,
    "lvl4": null,
    "lvl5": null,
    "lvl6": null
  },
  "objectID": "15-https://doc.scalingo.com/platform/databases/access"
}

We could have imagined that, because of its position in the hierarchy, the result ranked too low to appear in the search results. But only 14 results appear when dbclient-fetcher is typed, and that's not our limit: when searching for e, 20 results appear.

From what we understood, this number matches the hitsPerPage parameter: https://dashboard.algolia.com/apps/RWJM2H1BD2/explorer/configuration/scalingo-doc/pagination. It's set to 20 by default. We tried changing it to 30 but saw no change.

If we add the word script, the result appears.

It's not an indexing problem, because the result is shown when another word is added (for example, script). It's not a limit on the number of results shown, because only 14 results are displayed, not 20. So it's also not a question of hierarchy/priority: if the ranking were poor, the result would simply appear last.

aurelien-reeves-scalingo commented 1 year ago

Back to Backlog. We'll continue the investigation later.

Anouchka-M commented 1 year ago

I got an interesting answer from Algolia that should solve the problem:

In regards to the dbclient-fetcher query, after having a look at your index I can see the reason why the 2 results don't appear is due to the distinct attribute set to true and the attribute for distinct is url. This means that the engine will return only the most relevant variant for each group (same url).
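To make the effect of distinct concrete, here is a minimal Python sketch of the grouping behaviour described above. It is a simplified simulation, not Algolia's implementation: the function name apply_distinct and the example records are hypothetical, and the hits are assumed to be already sorted best-first.

```python
from typing import List


def apply_distinct(ranked_hits: List[dict], attribute: str, distinct: bool = True) -> List[dict]:
    """Keep only the best-ranked hit for each value of `attribute` when distinct is on."""
    if not distinct:
        return list(ranked_hits)
    seen = set()
    kept = []
    for hit in ranked_hits:  # assumed to be sorted from most to least relevant
        key = hit[attribute]
        if key in seen:
            continue  # a better-ranked record with the same value was already kept
        seen.add(key)
        kept.append(hit)
    return kept


# Hypothetical records: the first and third share the same url value
hits = [
    {"objectID": "1", "url": "https://example.com/docs/a#install"},
    {"objectID": "2", "url": "https://example.com/docs/b#setup"},
    {"objectID": "3", "url": "https://example.com/docs/a#install"},
]

with_distinct = [h["objectID"] for h in apply_distinct(hits, "url", distinct=True)]
without_distinct = [h["objectID"] for h in apply_distinct(hits, "url", distinct=False)]
print(with_distinct)
print(without_distinct)
```

With distinct on, record 3 is dropped because record 1 has the same url and ranks higher, which is the same mechanism that hides the expected paragraph in the dbclient-fetcher query.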

If you turn off the distinct setting using Add Query Parameter to test this, you will find the wanted results in 7th and 8th position.

Now, the reason why they are not ranked higher is due to the Position criterion, which is on the "6th word" vs "5th word" for the preceding results.
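As a rough illustration of the "6th word" remark, the sketch below counts the position of the first matching word in the content field of the indexed record. The tokenization here (keeping hyphenated terms as one word) is an assumption made for the example; Algolia's real tokenizer and ranking formula may differ, and first_match_position is a hypothetical helper.

```python
import re
from typing import Optional


def first_match_position(content: str, query: str) -> Optional[int]:
    """Return the 1-based index of the first word starting with `query`, or None."""
    # Assumption: words are runs of letters, digits, underscores and hyphens
    words = re.findall(r"[\w-]+", content.lower())
    for index, word in enumerate(words, start=1):
        if word.startswith(query.lower()):
            return index
    return None


# Content taken from the indexed record shown earlier in this thread
content = "You can use the script dbclient-fetcher to download and install it."
position = first_match_position(content, "dbclient-fetcher")
print(position)  # 6
```

A record where the query term appears one word earlier would win on this criterion, all else being equal.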

If you wish to see these 2 results on top, the best option is to pin them with rules. You can find more information on pinning with rules here https://www.algolia.com/doc/guides/managing-results/rules/merchandising-and-promoting/how-to/promote-hits/.
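For reference, a pinning rule could look roughly like the following, written here as a Python dict that serializes to JSON. This is a sketch under assumptions: the field names follow the Algolia Rules schema as I understand it (conditions with pattern and anchoring, consequence with promote) and should be checked against the linked guide. The rule identifier is hypothetical, and only the one record objectID visible in this thread is pinned.

```python
import json

rule = {
    "objectID": "pin-dbclient-fetcher",  # hypothetical rule identifier
    "conditions": [
        # Trigger only when the query is exactly this term
        {"pattern": "dbclient-fetcher", "anchoring": "is"}
    ],
    "consequence": {
        "promote": [
            {
                # objectID of the record shown earlier in this thread
                "objectID": "15-https://doc.scalingo.com/platform/databases/access",
                "position": 0,  # 0 means first result
            }
        ]
    },
}

print(json.dumps(rule, indent=2))
```

The second result mentioned by Algolia would need its own entry in promote once its objectID is known.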