mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org
Other
389 stars 195 forks source link

Bug on admin summary page when viewing authorities with their FOI email missing #5084

Closed ghost closed 5 years ago

ghost commented 5 years ago

There's a bug there in that there should be an authority name next to each "eye" symbol. Please see uploaded screenshot showing issue.

screen shot 2019-01-30 at 14 54 42

Best Wishes Ryan

mdeuk commented 5 years ago

+1, I can reproduce this bug on WDTK. If memory serves, the bodies listed are ones which do not have a Welsh localisation (name and email), however, ordinarily the name of the body would have been displayed in this view in English.

RichardTaylor commented 5 years ago

So probably linked to #4436 (Authorities showing as missing FOI email addresses) but a bug in its own right too.

The bug means the link to the authority in question isn't available to administrators.

Also the number "296" in this case doesn't appear to correspond with the number of rows below it.

On a closely related point it might be useful if defunct and not_apply bodies were excluded from this list (I don't know if they are already).

garethrees commented 5 years ago

The problem here is that a cy translation record exists for these authorities, but all its attributes are empty string (so "blank", rather than "NULL").

That means that we don't fall back to the EN translation.

I can't figure out how they got in to this state, as in general the translation records are only created when there's something meaningful to store. Doesn't look like it was a CSV import that created the translation records, but I can't be sure.

There are 142 records where the CY translation name attribute is empty

SELECT * FROM public_body_translations WHERE name = '';

I think we can just delete these as there's no meaningful data stored in other columns either.

There's one exception to this:

SELECT * FROM public_body_translations WHERE locale = 'cy' AND public_body_id = 4642;
  id   | public_body_id | locale | publication_scheme | notes | first_letter | request_email |   url_name   |  short_name  | name |         created_at         |        updated_at         | disclosure_log
-------+----------------+--------+--------------------+-------+--------------+---------------+--------------+--------------+------+----------------------------+---------------------------+----------------
 52173 |           4642 | cy     |                    |       | [NULL]       |               | bws_caerdydd | Bws Caerdydd |      | 2016-06-05 22:24:09.594935 | 2018-02-27 13:27:59.73357 | [NULL]

In this case it looks like the short_name has been intentionally filled out, but we need a name translation too.

RichardTaylor commented 5 years ago

The body with a translated short name, but an untranslated name is: https://www.whatdotheyknow.com/body/cardiff_bus https://www.whatdotheyknow.com/cy/body/bws_caerdydd

At the time of writing the body's public Welsh language page didn't contain the body name. I've now copied the English name across, because it's a limited company, it's name is what it is and is in English - that's the only name it formally has as I understand it. I can't see a Welsh name field at Companies House https://beta.companieshouse.gov.uk/company/02001229

garethrees commented 5 years ago

I've just come to look at this with the aim of deleting the blank welsh translations, and now there are 184 records (up from 142) where the CY name attribute is empty:

foi=> SELECT COUNT(*) FROM public_body_translations WHERE name = '';
 count
-------
   184
garethrees commented 5 years ago

Noting this for future reference:

\pset null 'Ø'
garethrees commented 5 years ago

I must have mis-counted originally.

The last record with a blank String name was created long before https://github.com/mysociety/alaveteli/issues/5084#issuecomment-461796517.

SELECT created_at FROM public_body_translations WHERE name = '' ORDER BY created_at DESC LIMIT 1;
         created_at
----------------------------
 2018-04-27 13:59:22.998517

I think what must be happening here is that for a time, we must have allowed blank translations to get created. The model validation appears to still allow this, but we must have done something at some point to prevent blank translations being submitted.

Now, when a body with an existing blank CY translation is updated, the translation also gets updated, but blank translations are no longer created.

Lets look at the most recently updated blank translation:

SELECT * FROM public_body_translations WHERE name = '' ORDER BY updated_at DESC LIMIT 1;
  id   | public_body_id | locale | publication_scheme | notes | first_letter | request_email | url_name | short_name | name |         created_at         |         updated_at         | disclosure_log
-------+----------------+--------+--------------------+-------+--------------+---------------+----------+------------+------+----------------------------+----------------------------+----------------
 59341 |            246 | cy     |                    |       | Ø            |               | body     |            |      | 2017-04-27 10:16:28.724009 | 2019-05-13 17:34:25.308115 | Ø

We can see the update includes the cy params:

I, [2019-05-13T18:34:25.213308 #13026]  INFO -- : Started PUT "/admin/bodies/246" for REDACTED at 2019-05-13 18:34:25 +0100
I, [2019-05-13T18:34:25.219152 #13026]  INFO -- : Processing by AdminPublicBodyController#update as HTML
I, [2019-05-13T18:34:25.219331 #13026]  INFO -- :   Parameters: {"utf8"=>"✓", "authenticity_token"=>"REDACTED", "public_body"=>{"locale"=>"en", "name"=>"Lincolnshire County Council", "short_name"=>"", "request_email"=>"REDACTED", "publication_scheme"=>"http://www.lincolnshire.gov.uk/section.asp?catid=20718", "notes"=>"", "translations_attributes"=>{"cy"=>{"locale"=>"cy", "name"=>"", "short_name"=>"", "request_email"=>"", "publication_scheme"=>"", "notes"=>"", "id"=>"59341"}}, "tag_string"=>"local_council", "home_page"=>"", "disclosure_log"=>"", "last_edit_comment"=>"Requested by: Authority (REDACTED)\r\nSource URL: direct"}, "change_request_id"=>"3676", "subject"=>"Re: Update email address - Lincolnshire County Council", "response"=>"Dear Authority,\r\n\r\nThanks for your suggestion to update the email address for Lincolnshire County Council to REDACTED. This has now been done and any new requests will be sent to the new address.\r\n\r\nYours,\r\n\r\nThe WhatDoTheyKnow team.\r\n", "commit"=>"Save", "id"=>"246"}

Now, lets take a body I just updated – 245. Again we can see the cy params:

I, [2019-05-20T06:50:43.650757 #4406]  INFO -- : Started PUT "/admin/bodies/245" for REDACTED at 2019-05-20 06:50:43 +0100
I, [2019-05-20T06:50:43.653663 #4406]  INFO -- : Processing by AdminPublicBodyController#update as HTML
I, [2019-05-20T06:50:43.653832 #4406]  INFO -- :   Parameters: {"utf8"=>"✓", "authenticity_token"=>"REDACTED", "public_body"=>{"locale"=>"en", "name"=>"Lincoln City Council", "short_name"=>"", "request_email"=>"REDACTED", "publication_scheme"=>"https://www.lincoln.gov.uk/your-council/data-transparency/freedom-of-information/publication-scheme/", "notes"=>"", "translations_attributes"=>{"cy"=>{"locale"=>"cy", "name"=>"", "short_name"=>"", "request_email"=>"", "publication_scheme"=>"", "notes"=>""}}, "tag_string"=>"local_council", "home_page"=>"http://www.lincoln.gov.uk", "disclosure_log"=>"", "last_edit_comment"=>"Add publication scheme"}, "commit"=>"Save", "id"=>"245"}

…but an empty translation doesn't get created:

SELECT locale FROM public_body_translations WHERE public_body_id = 245;
 locale
--------
 en

The notable difference in the params is that 246 is being updated through a change request. I don't think that's a problem here, as there's no evidence that change requests are creating empty translations (surely see a much more recent created_at in the translations table if that was the case).

So, I think the original course of action should still fix this:

PublicBody::Translation.where(name: '', locale: 'cy').destroy_all

I've run this now.

garethrees commented 5 years ago

There are still some cases where the name fails to show:

Screenshot 2019-05-20 at 07 33 03

This is because the only translation that gets loaded is the one with a missing request email:

 # https://github.com/mysociety/alaveteli/blob/bf2658c61be59304f5c609418f116861764a80b1/app/controllers/admin_general_controller.rb#L36-L38

@blank_contacts = PublicBody.without_request_email.
                                 limit(20).
                                   includes(:tags, :translations)

b = @blank_contacts.second
b.translations.map(&:locale)
# => [:cy]

b.name
# => nil
b.name(:en)
# => nil
b.name(:cy)
# => "Comisiynydd Plant"
garethrees commented 5 years ago

The problem here is that we add includes(:tags, :translations) on the end of the PublicBody.without_request_email scope.

# spec/models/public_body_spec.rb
describe '.without_request_email' do
  # …
  it 'allows you to read the default locale name' do
    blank_body.destroy!
    public_body.destroy!

    public_body = FactoryBot.create(:public_body)
    public_body.request_email = 'en@example.com'
    public_body.save

    AlaveteliLocalization.with_locale(:es) do
      public_body.name = 'CY name'
      public_body.request_email = ''
      public_body.save
    end

    # COMMENT / UNCOMMENT INCLUDES CALL TO SEE SPEC PASS/FAIL
    bodies = PublicBody.without_request_email#.includes(:tags, :translations)
    # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    puts 'bodies count: ' + bodies.size.to_s
    print 'locales: '
    p bodies.last.translations.map(&:locale)
    puts 'en name: ' + bodies.last.name.to_s
    puts 'en email: ' + bodies.last.request_email.to_s
    puts 'es name: ' +bodies.last.name(:es).to_s
    AlaveteliLocalization.with_locale(:es) do
      puts 'es email: ' + bodies.last.request_email.to_s
    end

    expect(bodies.last.name).to_not be_nil
    expect(bodies.last.name).to_not be_blank
  end
end
$ bin/rspec spec/models/public_body_spec.rb:895
Run options: include {:locations=>{"./spec/models/public_body_spec.rb"=>[895]}}

Randomized with seed 59654
bodies count: 1
locales: [:en, :es]
en name: Example Public Body 4
en email: en@example.com
es name: CY name
es email:
.

Finished in 1.22 seconds (files took 15.97 seconds to load)
1 example, 0 failures

Uncommenting .includes(:tags, :translations) leads to a failure:

$ bin/rspec spec/models/public_body_spec.rb:895
Run options: include {:locations=>{"./spec/models/public_body_spec.rb"=>[895]}}

Randomized with seed 32073
bodies count: 1
locales: [:es]
en name:
en email:
es name: CY name
es email:
F

Failures:

  1) PublicBody.without_request_email includes bodies with a translation that has an empty request email
     Failure/Error: expect(bodies.last.name).to_not be_nil

       expected: not nil
            got: nil
     # ./spec/models/public_body_spec.rb:921:in `block (3 levels) in <top (required)>'

Finished in 1.4 seconds (files took 17.91 seconds to load)
1 example, 1 failure
garethrees commented 5 years ago
Body.without_request_email.to_sql
# => SELECT "public_bodies".* FROM "public_bodies" INNER JOIN "public_body_translations" ON "public_body_translations"."public_body_id" = "public_bodies"."id" WHERE "public_body_translations"."request_email" = '' AND (NOT (EXISTS(SELECT 1 FROM "has_tag_string_tags" WHERE (has_tag_string_tags.model_id = public_bodies.id) AND (has_tag_string_tags.model = 'PublicBody') AND "has_tag_string_tags"."name" = 'defunct')))

PublicBody.without_request_email.includes(:tags, :translations).to_sql
# => SELECT "public_bodies"."id" AS t0_r0, "public_bodies"."version" AS t0_r1, "public_bodies"."last_edit_editor" AS t0_r2, "public_bodies"."last_edit_comment" AS t0_r3, "public_bodies"."created_at" AS t0_r4, "public_bodies"."updated_at" AS t0_r5, "public_bodies"."home_page" AS t0_r6, "public_bodies"."api_key" AS t0_r7, "public_bodies"."info_requests_count" AS t0_r8, "public_bodies"."disclosure_log" AS t0_r9, "public_bodies"."info_requests_successful_count" AS t0_r10, "public_bodies"."info_requests_not_held_count" AS t0_r11, "public_bodies"."info_requests_overdue_count" AS t0_r12, "public_bodies"."info_requests_visible_classified_count" AS t0_r13, "public_bodies"."info_requests_visible_count" AS t0_r14, "has_tag_string_tags"."id" AS t1_r0, "has_tag_string_tags"."model_id" AS t1_r1, "has_tag_string_tags"."name" AS t1_r2, "has_tag_string_tags"."created_at" AS t1_r3, "has_tag_string_tags"."value" AS t1_r4, "has_tag_string_tags"."model" AS t1_r5, "has_tag_string_tags"."updated_at" AS t1_r6, "public_body_translations"."id" AS t2_r0, "public_body_translations"."public_body_id" AS t2_r1, "public_body_translations"."locale" AS t2_r2, "public_body_translations"."created_at" AS t2_r3, "public_body_translations"."updated_at" AS t2_r4, "public_body_translations"."name" AS t2_r5, "public_body_translations"."short_name" AS t2_r6, "public_body_translations"."request_email" AS t2_r7, "public_body_translations"."url_name" AS t2_r8, "public_body_translations"."notes" AS t2_r9, "public_body_translations"."first_letter" AS t2_r10, "public_body_translations"."publication_scheme" AS t2_r11, "public_body_translations"."disclosure_log" AS t2_r12 FROM "public_bodies" INNER JOIN "public_body_translations" ON "public_body_translations"."public_body_id" = "public_bodies"."id" LEFT OUTER JOIN "has_tag_string_tags" ON "has_tag_string_tags"."model_id" = "public_bodies"."id" AND "has_tag_string_tags"."model" = 'PublicBody' WHERE "public_body_translations"."request_email" = '' AND (NOT (EXISTS(SELECT 1 FROM "has_tag_string_tags" WHERE (has_tag_string_tags.model_id = public_bodies.id) AND (has_tag_string_tags.model = 'PublicBody') AND "has_tag_string_tags"."name" = 'defunct')))
garethrees commented 5 years ago

@gbp any idea why includes(:tags, :translations) was added in https://github.com/mysociety/alaveteli/commit/2daee6c881daca0a8265d890f80583e08775df72? (see from https://github.com/mysociety/alaveteli/issues/5084#issuecomment-493856461 down for context).

garethrees commented 5 years ago

There's also the original case where the translations are mostly NULL values:

SELECT * FROM public_body_translations WHERE locale = 'cy' AND name IS NULL ORDER BY created_at DESC LIMIT 1;
  id   | public_body_id | locale | publication_scheme | notes | first_letter | request_email | url_name | short_name | name |         created_at         |         updated_at         | disclosure_log
-------+----------------+--------+--------------------+-------+--------------+---------------+----------+------------+------+----------------------------+----------------------------+----------------
 83504 |            174 | cy     | Ø                  | Ø     | Ø            | Ø             | Ø        | Ø          | Ø    | 2018-05-02 07:29:20.923388 | 2018-05-02 07:29:20.923388 | Ø

Destroying those too:

PublicBody::Translation.where(name: nil, locale: 'cy').destroy_all
gbp commented 5 years ago

@gbp any idea why includes(:tags, :translations) was added in 2daee6c

Not certain why that is there. This could've been around the time I was importing in the production public_bodies and I would've seen a massive performance issue if I had missed the translation table - just a guess.

I'm not getting any spec failures removing the preload, but you're saying you are - how strange.

garethrees commented 5 years ago

Yeah a bit slower without by the looks of it:

Benchmark.bmbm do |performance|
  performance.report('includes   ') { PublicBody.without_request_email.includes(:tags, :translations).map(&:name) }
  performance.report('no-includes') { PublicBody.without_request_email.map(&:name) }
end
# Rehearsal -----------------------------------------------
# includes      0.100000   0.010000   0.110000 (  0.130291)
# no-includes   0.140000   0.010000   0.150000 (  0.215384)
# -------------------------------------- total: 0.260000sec
# 
#                   user     system      total        real
# includes      0.110000   0.000000   0.110000 (  0.131488)
# no-includes   0.140000   0.010000   0.150000 (  0.199661)

I'm not getting any spec failures removing the preload, but you're saying you are - how strange.

To be clear, I added a new spec to PublicBody.without_request_email in spec/models/public_body_spec.rb and commented/uncommented the includes call. I've updated https://github.com/mysociety/alaveteli/issues/5084#issuecomment-493866816 to be a little clearer.