mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org

Internal site search is generally poor #1179

Open hsenag opened 11 years ago

hsenag commented 11 years ago

I think it's generally known that the internal site search on Alaveteli is quite weak - on WDTK we often get people saying that they can't find a particular body just because the obvious search terms don't pick it up.

I can't find a general issue on the topic, so thought it's worth raising one that we can add examples to as we encounter them.

hsenag commented 11 years ago

Existing specific problems with search: #932

Problems with search counts: #122, #456

hsenag commented 11 years ago

One standard workaround we often recommend to users is simply to use Google - e.g. search for "site:www.whatdotheyknow.com king's".

hsenag commented 10 years ago

Searching for things in quotes (e.g. "psychological assessment") returns results where the words are separated.
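To make the expected behaviour concrete, here is a minimal Python sketch of strict phrase matching: a quoted query should match only when its words appear adjacent and in order. It is an illustration only, not how Alaveteli or its search backend implements phrase search, and the names `tokenize` and `contains_phrase` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(document, phrase):
    """Return True only if the phrase's tokens appear adjacent and in order."""
    doc_tokens = tokenize(document)
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    if n == 0:
        return False
    # Slide a window of length n over the document tokens.
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )
```

Under this definition a document containing "a psychological report and risk assessment" would not match the quoted query, because the two words are separated.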

garethrees commented 9 years ago

For some reason it doesn't even display direct matches for some searches.

Searching for "Cardiff Council" returns 2 pages of results, with the actual body "Cardiff Council" on page 2.

RichardTaylor commented 8 years ago

A WhatDoTheyKnow user has asked how requests are ordered on a public body page:

https://www.whatdotheyknow.com/body/tfl

One request from two years previously appeared high up in the results, apparently because it had just been classified. A non-admin user can't see the classification date, so this behaviour is inexplicable to them.

Added surprise: the old recently classified request only appears on the public body page when not logged in; when I'm logged in as an admin it's not there.

RichardTaylor commented 7 years ago

A WhatDoTheyKnow user had trouble finding the Scotland Office. ("Scotland Office" is a body on WhatDoTheyKnow.)

A search for "Scotland Office" without quotes at: https://www.whatdotheyknow.com/select_authority doesn't include Scotland Office in the first page of results.

The Scotland Office is currently the fifth result for a search for Scotland Office via the search bar:

https://www.whatdotheyknow.com/search/Scotland%20Office/all

The user wrote to explain why they'd been struggling:

On the site if I go into the View Authorities option I can find both the Scotland Office and the Wales Office through the search available within View Authorities. If I go to the Make a Request Option and then do a search neither the Scotland Office or Wales Office appear. The searches might be viewing different datasets. In my case I was searching for the Scotland Office and the Wales Office through the search option available under Make a Request.

equivalentideas commented 7 years ago

Another example for this ticket: In the public bodies search "WA Attorney General" can't find "WA Department of the Attorney General".

https://www.righttoknow.org.au/body/list/all?utf8=%E2%9C%93&public_body_query=WA+Attorney+General&commit=Search

https://www.righttoknow.org.au/body/wa_department_of_the_attorney_general

But, lose the "WA" and it works :S https://www.righttoknow.org.au/body/list/all?utf8=%E2%9C%93&public_body_query=Attorney+General&commit=Search
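The behaviour one might expect here is that a body matches when every query word appears somewhere in its name, regardless of case or of the words in between. A minimal Python sketch of that rule follows. It does not diagnose why the real search misses this body, and `matches_all_terms` is a hypothetical helper, not part of Alaveteli.

```python
def matches_all_terms(query, name):
    """Return True if every query word occurs as a word in the name.

    Comparison is case-insensitive and ignores word order and extra words.
    """
    query_terms = set(query.lower().split())
    name_terms = set(name.lower().split())
    return query_terms <= name_terms


body = "WA Department of the Attorney General"
print(matches_all_terms("WA Attorney General", body))
print(matches_all_terms("attorney general", body))
```

Both calls succeed under this rule, so the extra "WA" term would narrow the result set rather than empty it.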

crowbot commented 6 years ago

I've created a new ticket (https://github.com/mysociety/alaveteli/issues/4426) for the specific issue of quoted authority searches not giving a direct match as the top result.

RichardTaylor commented 6 years ago

A search for:

:Information Commissioner

doesn't currently return the Information Commissioner's Office in the first page of search results on WhatDoTheyKnow.

This was noted by a user who also reported difficulty finding police forces via search.

garethrees commented 5 years ago

graeme interesting idea... Xapian with writes done via Sidekiq, allowing multiple app instances or containers: https://github.com/gernotkogler/xapian_db#installation-with-sidekiq

RichardTaylor commented 5 years ago

A WhatDoTheyKnow user has been in touch to point out that a search for "Home Office" doesn't return the Home Office on the first page of results.

MattK1234 commented 4 years ago

A user has contacted the WhatDoTheyKnow Administration Team in relation to this problem:

I have noticed that when you are searching for a public authority if you don't use a capital letter for the first of the authority then sometimes a search returns blank and some key words do not appear in searches. But it is not consistent for all authorities. Sometimes if a capital letter is not used my search engine will highlight that word as spelt incorrectly on your page. Unsure if this then stops a search? If I search lewisham borough council - there are no results, but if I search lewisham, lewisham borough or Lewisham Borough Council then results will appear. If I search for council/Council there are no results but if I search for B/borough then many results are found which include Borough Council.

RichardTaylor commented 3 years ago

Specific issue raised with

https://www.whatdotheyknow.com/select_authority

Searches for 'health', 'Health' or 'department of health' or 'department for health' or variations on that theme don't result in any hits at the moment.

Generally a "no results found" response doesn't even appear, leading to reports that the service is broken.

skenaja commented 3 years ago

Another example I found recently:

https://www.whatdotheyknow.com/select_authority?utf8=%E2%9C%93&query=companies+house&bodies=1&commit=Search

garethrees commented 3 years ago

A public complaint https://twitter.com/smithsam/status/1330825609723961345

RichardTaylor commented 3 years ago

I'm adding the reduce-admin flag because on WhatDoTheyKnow we're getting people asking us to list bodies we already have on the site.

This is also a sign the service isn't helping people as much as it could. Some people might give up or make a request privately rather than ask us to list a body.

RichardTaylor commented 3 years ago

+1 .. a user couldn't find a major county council's page on our site and asked us to add it.

Their response when pointed to it:

Great thanks very much, but it didn't come up when I searched for it.

RichardTaylor commented 3 years ago

+1 another case where a user has written to ask us to add a major body we already list, this time Birmingham City Council

RichardTaylor commented 3 years ago

+2 more WhatDoTheyKnow.com users asking us to add bodies we already list, one noting they had tried to find it.

RichardTaylor commented 3 years ago

+1 another user wrote to WhatDoTheyKnow.com today, this time writing: "This borough council is not represented on your website". WhatDoTheyKnow lists all UK borough councils - however it appears the user couldn't find the one they wanted.

RichardTaylor commented 3 years ago

The user referred to in the above comment has clarified: "For some reason this info didn't appear when I entered it into the search box."

MattK1234 commented 3 years ago

Comment on Twitter relating to the site search not accounting for mixed cases: https://twitter.com/legalfeminist/status/1357445448395546626?s=20

garethrees commented 3 years ago

Nice article about improving search with a combination of Ruby and Postgres https://blog.testdouble.com/posts/2021-09-09-how-to-build-a-search-engine-with-ruby-on-rails/

RichardTaylor commented 2 years ago

A user has written to the WhatDoTheyKnow team setting out their experience of trying to use the search functions, concluding:

the search function of this website is of limited use.

One of the issues raised was the treatment of "for" when they searched for a phrase. I've noted that on the specific issue for stop words:

https://github.com/mysociety/alaveteli/issues/1575#issuecomment-1018018883

mdeuk commented 1 year ago

+1 - we recently had a user contact WDTK to add a public body (Staffordshire County Council) which has been listed since 2008.

The issue here seems to be that a search for Staffordshire produces 94 results, and you'd need to get to page 5 of the results before the correct body was listed.

WDTK now lists a lot of Schools with geographic information included in their names (sourced from GIaS) - so this is quite possibly a regular frustration for users. Would 'weighting' of results be a potential option?
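As a rough illustration of what weighting could mean, the Python sketch below scores an exact name match highest, then names that start with the query, then names that merely contain all the query words, and breaks ties in favour of shorter names. The tiers, the tie-break and the function name `rank_bodies` are assumptions for illustration, not Alaveteli's ranking. Apart from the two councils, the body names are made up.

```python
def rank_bodies(query, names):
    """Order matching names by a simple tiered relevance score."""
    q = query.strip().lower()
    q_terms = q.split()

    def score(name):
        n = name.lower()
        if n == q:
            return 3  # exact name match
        if n.startswith(q):
            return 2  # name begins with the query
        if all(term in n.split() for term in q_terms):
            return 1  # every query word appears in the name
        return 0      # no match

    matches = [name for name in names if score(name) > 0]
    # Highest score first; shorter names win ties, then alphabetical order.
    return sorted(matches, key=lambda name: (-score(name), len(name), name))


names = [
    "Example Primary School, Staffordshire",  # made-up name
    "Staffordshire Example Academy Trust",    # made-up name
    "Staffordshire County Council",
    "Somerset Council",
]
for name in rank_bodies("Staffordshire", names):
    print(name)
```

Preferring shorter names is only a heuristic. Many school names could still begin with the county name, so a real weighting scheme would probably need further signals such as body type or request volume.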

HelenWDTK commented 1 year ago

+1 We've been contacted by a user who asked us to add North Yorkshire Council. Searching for North Yorkshire Council without quotes doesn't return the council on the first 4 pages of the search results.

HelenWDTK commented 1 year ago

+1 We've been contacted by a user who couldn't find Somerset Council.

WilliamWDTK commented 1 year ago

+1, We've been contacted by a user who didn't find NHS England.


confirmordeny commented 1 year ago

A user was unable to view past the 20th page of search results.

garethrees commented 1 year ago

A user was unable to view past the 20th page of search results

Specifically tracked in https://github.com/mysociety/alaveteli/issues/2137.

garethrees commented 1 year ago

https://docs.paradedb.com/blog/introducing_bm25

laurentS commented 9 months ago

FWIW, madada.fr has similar issues. I'm exploring how to improve this for us beyond just switching the stemmer language, and will report back if we manage to make progress.

laurentS commented 9 months ago

Coming back to this with some (basic) findings:

Further thoughts:

WilliamWDTK commented 2 weeks ago

For what it's worth, we've had another WhatDoTheyKnow user get in touch to report that NHS England can't be found through https://www.whatdotheyknow.com/select_authority?bodies=1&commit=Search&page=1&pro=0&query=NHS+England&utf8=%E2%9C%93.