mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org
Other
389 stars 195 forks source link

[ERROR] public_body#show (RuntimeError) "Two bodies with the same historical URL name…" #6495

Open garethrees opened 3 years ago

garethrees commented 3 years ago

This is the underlying cause of the first issue described in https://github.com/mysociety/whatdotheyknow-theme/issues/879.

A RuntimeError occurred in public_body#show:

Two bodies with the same historical URL name: highways_england
app/models/public_body.rb:292:in `find_by_url_name_with_historic'

-------------------------------
Request:
-------------------------------

* URL : https://www.whatdotheyknow.com/body/highways_england
* HTTP Method: GET
* Parameters : {"view"=>"all", "controller"=>"public_body", "action"=>"show", "url_name"=>"highways_england"}
* Timestamp : 2021-08-27 10:16:23 +0100
* Rails root : /data/vhost/www.whatdotheyknow.com/alaveteli-2021-08-20T12-31-10

-------------------------------
Session:
-------------------------------

* session id: [FILTERED]
* data: {"session_id"=>"REDACTED",
"locale"=>"en",
"user_id"=>802,
"ttl"=>"2021-08-27T10:15:19.533+01:00",
"remember_me"=>false,
"post_redirect_token"=>"REDACTED",
"_csrf_token"=>"REDACTED",
"using_admin"=>1,
"admin_name"=>"REDACTED"}

-------------------------------
Backtrace:
-------------------------------

app/models/public_body.rb:292:in `find_by_url_name_with_historic'
app/controllers/public_body_controller.rb:31:in `block in show'
lib/alaveteli_localization.rb:48:in `with_locale'
app/controllers/public_body_controller.rb:30:in `show'
app/controllers/application_controller.rb:110:in `record_memory'
lib/strip_empty_sessions.rb:14:in `call'
garethrees commented 3 years ago
name = 'highways_england'
# => "highways_england"

PublicBody::Version.
  where(:url_name => name).
  distinct.
  pluck(:public_body_id)
# => [30, 55508]

PublicBody.find(30).name
# => "Highways Agency"

PublicBody.find(55508).name
# => "Highways England Company Limited"
garethrees commented 3 years ago

I've fixed the specific case (https://github.com/mysociety/whatdotheyknow-theme/issues/879#issuecomment-907123559), but really we should prevent an authority being saved if this error case is going to happen.

garethrees commented 3 years ago

Just noting IPZ has also run in to this https://github.com/codeforcroatia/imamopravoznati/issues/52.

laurentS commented 2 years ago

We've found this to be a problem as we are exploring how to link our data with wikidata in a similar fashion to what you did with https://www.wikidata.org/wiki/Property:P8167 (which relies on the url_name as an id to link back to WhatDoTheyKnow).

Specifically, it is possible to change the name or short name of a public body so that its url_name takes over a historic url_name of another body. On the other hand, it is not possible to rename a body to cause a conflict with a current url_name, which seems like the correct way to handle things.

Would you be interested in a small PR that prevents changing names which would result in historic name conflicts, and throws a similar error to the one raised with conflicting current names? I think this would guarantee the stability of url_name over time, and therefore make good urls that don't change :)

garethrees commented 2 years ago

Would you be interested in a small PR that prevents changing names which would result in historic name conflicts

This seems pretty sensible but I'm wondering whether there's ever a case where this would be a problem. To play this out a little…

  1. We have Ministry of Defence (/mod)
  2. We all decide to be nice to each other; MoD closes.
  3. Years go by…
  4. We feel we need a Ministry of Doge.
  5. Everyone calls it the MoD. Having the slug as /mod would be most help to our users.

I think in WDTK we'd deal with this by:

  1. Removing the shortname of Ministry of Defence, so that it becomes /ministry_of_defence to free up /mod.
  2. At this point the /mod slug would be "historic"
  3. Create the new body with the MoD shortname to get the /mod slug
  4. Add a note to both bodies pointing them at one another by their current slug saying "there's this other body that used to/now goes by MoD"

This would break the wikidata linking though. It's not in our current practice to check that.

In any case, I think we should still go ahead with the PR. If we did find ourselves in this situation and really did want to break the URL, I think we could manually edit/destroy the PublicBody::Version with the conflicting url_name, but I agree that by default avoiding these conflicts would be best.