Open hsenag opened 11 years ago
Existing specific problems with search: #932
Problems with search counts: #122, #456
One standard workaround we often recommend to users is simply to use Google - e.g. search for "site:www.whatdotheyknow.com king's".
Searching for things in quotes (e.g. "psychological assessment") returns results where the words are separated.
For some reason it doesn't even display direct matches for some searches.
Searching for "Cardiff Council" returns 2 pages, with the actual body "Cardiff Council" on page 2
A WhatDoTheyKnow user has asked how requests are ordered on a public body page:
https://www.whatdotheyknow.com/body/tfl
One request from two years previously appeared high up in the results, apparently because it had just been classified. A non-admin visitor can't see the classification date, so this behaviour is inexplicable to them.
Added surprise: the old recently classified request only appears on the public body page when not logged in; when I'm logged in as an admin it's not there.
A WhatDoTheyKnow user had trouble finding the Scotland Office ("Scotland Office" is a body on WhatDoTheyKnow).
A search for Scotland Office (without quotes) at https://www.whatdotheyknow.com/select_authority doesn't include the Scotland Office on the first page of results.
The Scotland Office is currently the fifth result for a search for Scotland Office via the search bar:
https://www.whatdotheyknow.com/search/Scotland%20Office/all
The user wrote to explain why they'd been struggling:
On the site if I go into the View Authorities option I can find both the Scotland Office and the Wales Office through the search available within View Authorities. If I go to the Make a Request Option and then do a search neither the Scotland Office or Wales Office appear. The searches might be viewing different datasets. In my case I was searching for the Scotland Office and the Wales Office through the search option available under Make a Request.
Another example for this ticket: In the public bodies search "WA Attorney General" can't find "WA Department of the Attorney General".
https://www.righttoknow.org.au/body/wa_department_of_the_attorney_general
But, lose the "WA" and it works :S https://www.righttoknow.org.au/body/list/all?utf8=%E2%9C%93&public_body_query=Attorney+General&commit=Search
I've created a new ticket (https://github.com/mysociety/alaveteli/issues/4426) for the specific issue of quoted authority searches not giving a direct match as the top result.
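To illustrate the behaviour that ticket is asking for, here is a minimal sketch of re-ranking results so that a direct name match comes first. It is not Alaveteli's or Xapian's code: `normalise` and `exact_match_first` are hypothetical helpers, the input is assumed to be a plain list of body names already returned by the search engine, and "Cardiff Council Example Fund" in the usage note is a made-up name.

```ruby
# Hypothetical helper: lower-case the text, strip punctuation (ASCII only)
# and collapse whitespace so names and queries compare loosely.
def normalise(text)
  text.downcase.gsub(/[^a-z0-9\s]/, "").split.join(" ")
end

# Move any name that exactly matches the query (after normalisation) to the
# front, keeping the search engine's original order for everything else.
def exact_match_first(query, names)
  wanted = normalise(query)
  exact, rest = names.partition { |name| normalise(name) == wanted }
  exact + rest
end
```

For example, `exact_match_first('"Cardiff Council"', ["Cardiff Council Example Fund", "Cardiff Council"])` would put "Cardiff Council" ahead of the longer name, whatever order the engine returned them in.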
A search for:
Information Commissioner
doesn't currently return the Information Commissioner's Office in the first page of search results on WhatDoTheyKnow.
This was noted by a user who also reported difficulty finding police forces via search.
graeme interesting idea... xapian with writes done via sidekiq allowing multiple app instances or containers https://github.com/gernotkogler/xapian_db#installation-with-sidekiq
A WhatDoTheyKnow user has been in touch to point out a search for "Home Office" doesn't return the Home Office on the first page of results
A user has contacted the WhatDoTheyKnow Administration Team in relation to this problem:
I have noticed that when you are searching for a public authority if you don't use a capital letter for the first of the authority then sometimes a search returns blank and some key words do not appear in searches. But it is not consistent for all authorities. Sometimes if a capital letter is not used my search engine will highlight that word as spelt incorrectly on your page. Unsure if this then stops a search? If I search lewisham borough council - there are no results, but if I search lewisham, lewisham borough or Lewisham Borough Council then results will appear. If I search for council/Council there are no results but if I search for B/borough then many results are found which include Borough Council.
Specific issue raised with
https://www.whatdotheyknow.com/select_authority
Searches for 'health', 'Health' or 'department of health' or 'department for health' or variations on that theme don't result in any hits at the moment.
Generally a "no results found" response doesn't even appear, leading to reports that the service is broken.
Another example I found recently:
A public complaint https://twitter.com/smithsam/status/1330825609723961345
I'm adding the reduce-admin flag because on WhatDoTheyKnow we're getting people asking us to list bodies we do have on the site.
This is also a sign the service isn't helping people as it could be. Some people might give up or make a request privately rather than ask us to list a body.
+1 .. a user couldn't find a major county council's page on our site and asked us to add it.
Their response when pointed to it:
Great thanks very much, but it didn't come up when I searched for it.
+1 another case where a user has written to ask us to add a major body we already list, this time Birmingham City Council
+2 more WhatDoTheyKnow.com users asking us to add bodies we already list, one noting they had tried to find it.
+1 another user wrote to WhatDoTheyKnow.com today, this time writing: "This borough council is not represented on your website". WhatDoTheyKnow lists all UK borough councils - however it appears the user couldn't find the one they wanted.
The user referred to in the above comment has clarified: "For some reason this info didn't appear when I entered it into the search box."
Comment on Twitter relating to the site search not accounting for mixed cases: https://twitter.com/legalfeminist/status/1357445448395546626?s=20
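As a rough sketch of what case-insensitive matching would look like for the reports above, the following folds both the query and the body name to lower case before comparing terms. This is only an illustration of the expected behaviour, not a description of how Alaveteli or Xapian currently parse queries; `terms` and `matches_all_terms?` are hypothetical names.

```ruby
# Hypothetical helper: split text into lower-cased alphanumeric terms.
def terms(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# True when every term in the query appears in the name, ignoring case.
def matches_all_terms?(query, name)
  name_terms = terms(name)
  terms(query).all? { |term| name_terms.include?(term) }
end
```

Under this approach, "lewisham borough council", "Lewisham Borough Council" and "council" would all match a body named "Lewisham Borough Council", which is what the user in the earlier comment expected.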
Nice article about improving search with a combination of Ruby and Postgres https://blog.testdouble.com/posts/2021-09-09-how-to-build-a-search-engine-with-ruby-on-rails/
A user has written to the WhatDoTheyKnow team setting out their experience of trying to use the search functions, concluding:
the search function of this website is of limited use.
One of the issues raised was the treatment of "for" when they searched for a phrase, I've noted that on the specific issue for stop words:
https://github.com/mysociety/alaveteli/issues/1575#issuecomment-1018018883
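For anyone unfamiliar with why stop words matter for phrase searches, here is a small sketch of one way they can change the outcome. The stop word list, `index_terms` and `phrase_match?` are all assumptions made for illustration; this is not Xapian's stopper or the exact behaviour described in the linked issue.

```ruby
# Illustrative stop word list only, not the list Xapian or Alaveteli uses.
STOP_WORDS = %w[a an and for of the to].freeze

# Split text into lower-cased terms, optionally dropping stop words.
def index_terms(text, drop_stop_words:)
  words = text.downcase.scan(/[[:alnum:]]+/)
  drop_stop_words ? words - STOP_WORDS : words
end

# True when the phrase's terms appear consecutively in the document.
def phrase_match?(phrase, document, drop_stop_words:)
  needle = index_terms(phrase, drop_stop_words: drop_stop_words)
  haystack = index_terms(document, drop_stop_words: drop_stop_words)
  return false if needle.empty?

  haystack.each_cons(needle.length).any? { |window| window == needle }
end
```

With stop words dropped, the phrase "department for health" matches a document containing "Department of Health", because "for" and "of" both vanish; with stop words kept, it does not. Either behaviour may surprise a user, depending on what they expected.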
+1 - we recently had a user contact WDTK to add a public body (Staffordshire County Council), that has been listed since 2008.
The issue here seems to be that a search for Staffordshire produces 94 results, and you'd need to get to page 5 of the results before the correct body was listed.
WDTK now lists a lot of Schools with geographic information included in their names (sourced from GIaS) - so this is quite possibly a regular frustration for users. Would 'weighting' of results be a potential option?
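One possible form of weighting, sketched below, is to score each body by how much of its name the query covers, so that a short name like "Staffordshire County Council" outranks a long school name that merely ends in "Staffordshire". This is an assumption about what weighting could mean here, not Xapian's actual scoring; `coverage_score` and `rank_by_coverage` are hypothetical, and the school name in the usage note is invented.

```ruby
# Hypothetical helper: split text into lower-cased alphanumeric terms.
def terms(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# Fraction of the name's terms that also appear in the query (0.0 to 1.0).
def coverage_score(query, name)
  query_terms = terms(query).uniq
  name_terms = terms(name)
  return 0.0 if query_terms.empty? || name_terms.empty?

  matched = name_terms.count { |term| query_terms.include?(term) }
  matched.to_f / name_terms.length
end

# Sort names by descending coverage, keeping the original order for ties.
def rank_by_coverage(query, names)
  names.each_with_index
       .sort_by { |name, index| [-coverage_score(query, name), index] }
       .map(&:first)
end
```

For the query "Staffordshire", "Staffordshire County Council" scores 1/3 while "Example Primary School, Stafford, Staffordshire" scores 1/5, so the council would be listed first.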
+1 We've been contacted by a user who asked us to add North Yorkshire Council. Searching for North Yorkshire Council without quotes doesn't return the council on the first 4 pages of the search results.
+1 We've been contacted by a user who couldn't find Somerset Council.
+1, We've been contacted by a user who didn't find NHS England.
A user was unable to view past the 20th page of search results.
Specifically tracked in https://github.com/mysociety/alaveteli/issues/2137.
FWIW, madada.fr has similar issues. I'm exploring how to improve this for us beyond just switching the stemmer language, and will report back if we manage to make progress.
Coming back to this with some (basic) findings:
… acts_as_xapian.rb file into our theme because I couldn't figure out how to use class_eval properly in this case)
Further thoughts:
For what it's worth, we've had another WhatDoTheyKnow user get in touch to report that NHS England can't be found through https://www.whatdotheyknow.com/select_authority?bodies=1&commit=Search&page=1&pro=0&query=NHS+England&utf8=%E2%9C%93
We've had a user get in touch to complain that they were unable to find an authority (Adams' Grammar School) through the pro authority selector, using its name as a search term.
I think it's generally known that the internal site search on Alaveteli is quite weak - on WDTK we often get people saying that they can't find a particular body just because the obvious search terms don't pick it up.
I can't find a general issue on the topic, so thought it's worth raising one that we can add examples to as we encounter them.