laws-africa / peachjam

Project Peach Jam
https://agp.africanlii.org
GNU General Public License v3.0

As a user of lawlibrary.org.za I would like to know if the topic I am researching is covered on the AGP platform #521

Open NotoriousMBB opened 2 years ago

NotoriousMBB commented 2 years ago

By receiving a group of AGP search results for the searched keyword, offered as one of the filtering options on my lawlibrary.org.za search results page.

longhotsummer commented 1 year ago

Proposal

The idea is to help users of a local LII website find related content either in the AU materials on AfricanLII.org (primarily), or on other local LIIs (secondarily) -- let's call them offsite results. The value is that users of a local LII may not know which regional or AU materials relate to their research, and we want to help them realise that regional materials may apply.

In particular, we're doing this only when the user is searching. Other mechanisms (eg. on a document detail page) may come later.

The goal is not to mix all matching offsite results (which could be a lot) into local search results. Rather, it's to hint to the user that there may be others worth exploring. As such, it may be sufficient to show just 1-3 offsite results and indicate that they are external and there may be more worth looking at.

Basic user experience

At a high level, the experience looks like this:

  1. I search for "refugee" on lawlibrary.org.za
  2. the site shows me search results as usual
  3. in the background, lawlibrary asks agp.africanlii.org for just 3 relevant matches to the same search, from the AGP database
  4. the offsite results (if any) are inserted into the local search result list at about position 3
  5. they are formatted such that it's clear they are supplemental search results from a separate site
  6. the user can click through to one of the 3 offsite results (open in a new tab), or they can click through to the full search results on the remote website (open in a new tab)

The user may want to be able to opt out of seeing offsite results completely.

Here's an example of what this could look like:

[image: example of offsite results shown within the local search results]

Questions

  1. how do we avoid making the offsite results look like paid adverts?
  2. how do we avoid the offsite results taking up too much room?

Technical details

Some constraints:

  1. we don't want to make search results twice as slow just because we're conducting two searches; so the offsite results should happen in parallel, or after the local results are presented
  2. for many search terms, such as "refugee", there will almost always be offsite results, even if they are poor quality. We should only show them if they score as well as or better than the local results, to avoid spam.
  3. we should apply the same filters on the remote results as local results -- this means that local filters that aren't applicable remotely (eg. locality == "Gauteng") would result in no offsite results
  4. this is only ever done for the first page of results
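
A minimal sketch of constraints 3 and 4, written in Python purely for illustration. The function names and the document fields (`nature`, `locality`) are hypothetical and are not taken from the peachjam codebase. It shows why a filter that only exists locally naturally yields no offsite results: no remote document carries that field, so nothing matches.

```python
def matches_filters(doc: dict, filters: dict) -> bool:
    """True only if the document carries every filtered field with the requested value."""
    return all(doc.get(field) == value for field, value in filters.items())


def offsite_candidates(remote_docs: list[dict], filters: dict, page: int) -> list[dict]:
    """Apply the local search filters to remote documents (hypothetical helper)."""
    # Constraint 4: offsite results are only considered for the first page.
    if page != 1:
        return []
    # Constraint 3: the same filters apply remotely as locally.
    return [doc for doc in remote_docs if matches_filters(doc, filters)]


# Made-up remote documents; AGP material has no "locality" field in this sketch.
agp_docs = [
    {"title": "Remote document A", "nature": "Convention"},
    {"title": "Remote document B", "nature": "Convention"},
]

print(len(offsite_candidates(agp_docs, {"nature": "Convention"}, page=1)))  # 2
print(offsite_candidates(agp_docs, {"locality": "Gauteng"}, page=1))        # []
```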

The search flow could look like this:

  1. a lawlibrary.org.za user visits /search and enters their search term
  2. JS on the search page loads search results with /api/search/documents as usual
  3. JS gets the score of the last result on the first page -- this is the score that an offsite result must match (or better) to be included
  4. JS loads /api/search/documents/external, giving it the same search parameters and the minimum score
  5. the lawlibrary.org.za server calls agp.africanlii.org/api/search/documents/federated with the search params and the minimum score
  6. on agp.africanlii.org, the server does the search in ES, limits the results to a maximum of 3 (or whatever), and ensures that each result scores at least the minimum score
  7. the results are sent back to lawlibrary.org.za and then back to the client
  8. JS on the client injects the offsite results, if any, as the 3rd search result on the page
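
The scoring and injection logic in steps 3, 6 and 8 can be sketched as three small functions. This is an illustration in Python rather than the client-side JS the flow describes, and all names and the result shape (`{"title": ..., "score": ...}`) are assumptions. The proposal does not say what happens when there are no local results; this sketch simply skips offsite results in that case.

```python
from typing import Optional


def minimum_score(local_results: list[dict]) -> Optional[float]:
    """Step 3: the score of the last result on the first page, or None if the page is empty."""
    if not local_results:
        return None
    return local_results[-1]["score"]


def select_offsite(remote_hits: list[dict], min_score: Optional[float], limit: int = 3) -> list[dict]:
    """Step 6: keep at most `limit` remote hits that match or beat the minimum score."""
    if min_score is None:
        return []  # assumption: no local results means no threshold, so nothing is shown
    good = [hit for hit in remote_hits if hit["score"] >= min_score]
    good.sort(key=lambda hit: hit["score"], reverse=True)
    return good[:limit]


def inject_offsite(local_results: list[dict], offsite: list[dict], position: int = 3) -> list[dict]:
    """Step 8: insert the offsite hits as one grouped entry at the given 1-based position."""
    if not offsite:
        return list(local_results)
    index = min(position - 1, len(local_results))
    group = {"offsite": True, "results": offsite}
    return local_results[:index] + [group] + local_results[index:]


# Made-up scores for illustration only.
local = [{"title": f"Local {i}", "score": s} for i, s in enumerate([9.0, 7.5, 6.0, 4.0], start=1)]
remote = [{"title": "Remote 1", "score": 8.0}, {"title": "Remote 2", "score": 3.0},
          {"title": "Remote 3", "score": 4.0}]

threshold = minimum_score(local)
page = inject_offsite(local, select_offsite(remote, threshold))
print([entry.get("title", "offsite group") for entry in page])
```

Grouping the offsite hits into a single entry keeps them visually separate from local results, which may also help with the "paid adverts" question above.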

Note that we're assuming that the scores are comparable. We think this is safe because the structure of the indexes and scoring mechanisms are the same for all peachjam-based indexes.
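
If scores are comparable as assumed, the limit and threshold in step 6 could be expressed with Elasticsearch's standard `size` and `min_score` request-body options, where `min_score` excludes hits scoring below the given value. The sketch below only builds the request body as a plain dict; the function name, the `multi_match` query and the field names are placeholders and not peachjam's actual search query.

```python
def federated_search_body(term: str, min_score: float, size: int = 3) -> dict:
    """Build a hypothetical ES request body for the federated endpoint."""
    return {
        # Placeholder query: the real index fields and query type may differ.
        "query": {"multi_match": {"query": term, "fields": ["title", "content"]}},
        # Hits scoring below this value are dropped by Elasticsearch.
        "min_score": min_score,
        # Return at most this many hits.
        "size": size,
    }


body = federated_search_body("refugee", min_score=4.0)
print(body["min_score"], body["size"])  # 4.0 3
```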

longhotsummer commented 1 year ago

Federated search

We could use a similar device to bring local LII results into agp.africanlii.org, or to bring remote LII results into a local LII (not just results from AGP).
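
As a rough sketch of how this could generalise, the same request could be fanned out to several sites in parallel and the hits merged by score. Everything here is hypothetical: the `Fetcher` callables stand in for HTTP calls to each site's federated endpoint, and merging by raw score relies on the earlier assumption that scores are comparable across peachjam indexes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A fetcher takes (term, min_score) and returns hits shaped like {"title": ..., "score": ...}.
Fetcher = Callable[[str, float], list[dict]]


def federated_search(term: str, min_score: float, fetchers: dict[str, Fetcher],
                     limit: int = 3) -> list[dict]:
    """Query every site in parallel and return the top `limit` hits overall."""

    def run(item: tuple[str, Fetcher]) -> list[dict]:
        site, fetch = item
        try:
            hits = fetch(term, min_score)
        except Exception:
            hits = []  # a failing site should not break the local search page
        # Re-check the threshold locally and tag each hit with its source site.
        return [dict(hit, site=site) for hit in hits if hit["score"] >= min_score]

    with ThreadPoolExecutor(max_workers=max(1, len(fetchers))) as pool:
        per_site = list(pool.map(run, fetchers.items()))

    merged = [hit for hits in per_site for hit in hits]
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:limit]


def broken_site(term: str, min_score: float) -> list[dict]:
    raise TimeoutError("site unavailable")


# Fake fetchers with made-up scores, standing in for real HTTP requests.
sites = {
    "agp.africanlii.org": lambda term, ms: [{"title": "AGP hit", "score": 8.0}],
    "site-b.example": lambda term, ms: [{"title": "B hit", "score": 6.5},
                                        {"title": "B weak hit", "score": 1.0}],
    "site-c.example": broken_site,
}

results = federated_search("refugee", 4.0, sites)
print([(hit["site"], hit["title"]) for hit in results])
```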

NotoriousMBB commented 2 months ago

Folks, this is a deliverable for the GIZ and AGP project. Has this been implemented? I.e. I need to be able to see AfricanLII results in TanzLII as well as browse regional materials @niiroobiro

niiroobiro commented 2 months ago

Re-opening for further discussion.