infinite-industries / infinite

MIT License
5 stars 3 forks source link

Review venue data for issues, and fix manually #317

Closed rlmead closed 1 year ago

rlmead commented 1 year ago

Remaining tasks related to this work as of 3/19/23:


After we have cleaned up and reworked venue location data, we can manually review location information in the database and/or at infinite.industries/admin-venue-edit. Based on what we know of the data, we might find problems like:

The following information is outdated but may still be helpful. There is also related information about bad Google Maps URLs in the archived issue #316.

You can find a list of venues that need corrected addresses in the incomplete_addresses sheet of this spreadsheet.

The list in that spreadsheet corresponds to the output from this query as of 12/4/22:

select name, address from venues where not address like '%,%';

The list returned here may not be comprehensive - could be worth doing some other data validation checks, too. We'll probably find more buggy data as we work through #318.

rlmead commented 1 year ago

Notes on cleanup, remaining problems, and future considerations here

Queries used to find bad data (probably not comprehensive):

select name, address from venues where state != 'Kentucky';
select id, name, g_map_link from venues where g_map_link not like 'https://goo%' and g_map_link not like 'https://www.google.com%' and g_map_link not like 'https://g.page%';
select name, address from venues where city is null;
select name, address from venues where state is null;
rlmead commented 1 year ago

Duplicate venue cleanup

Get list of duplicate venues and the number of events that links to each:

infinite-api=# select count(*) as num_events, venues.id, venues.name, venues.address from events
full outer join venues on events.venue_id = venues.id
where venues.id in ('7ba33ec9-ca8e-4700-97a5-d202acea7b0e', 'eb3b9bbc-e804-463d-84cb-1752f0dce43c', '2c227750-4871-4b5f-937e-8f848302f894', '030f6741-5e1c-4651-ad45-e0f1ac4f5e4b', '81bdf6b0-8511-11ea-813a-37e204d97a42', '4f914d60-b922-11e9-b43e-456ed9f9b823', 'e48b9d30-d6ae-11ea-b6c2-ad89bfbe302e', 'e592ac95-b63a-4439-a3ae-a282532ff448', '597ba75c-4832-41ea-9106-01f6cfd5b88c', '57045000-3200-11e9-b43e-456ed9f9b823', '2d74ad30-3475-11e9-b43e-456ed9f9b823', '5623b730-3475-11e9-b43e-456ed9f9b823', '8f5c7260-047a-11e8-908e-07b6d0fe598a', '550afaa5-7998-4b6c-a74e-005ac84ea74e')
group by venues.id order by name;
 num_events |                  id                  |                               name                                |                     address
------------+--------------------------------------+-------------------------------------------------------------------+-------------------------------------------------
          1 | 7ba33ec9-ca8e-4700-97a5-d202acea7b0e | Country Boy Brewing                                               | 436 Chair Avenue, Lexington, Kentucky 40508
          0 | eb3b9bbc-e804-463d-84cb-1752f0dce43c | Country Boy Brewing                                               | 436 Chair Avenue, Lexington, 40508
          2 | 57045000-3200-11e9-b43e-456ed9f9b823 | Lexington Public Library (Central Library)                        | 140 East Main Street, Lexington, Kentucky 40507
         10 | 597ba75c-4832-41ea-9106-01f6cfd5b88c | Lexington Public Library (Central)                                | 140 East Main Street, Lexington, Kentucky 40507
          1 | 2d74ad30-3475-11e9-b43e-456ed9f9b823 | Lexington Public Library - Tates Creek Branch                     | 3628 Walden Drive, Lexington, Kentucky 40517
          1 | 5623b730-3475-11e9-b43e-456ed9f9b823 | Lexington Public Library - Tates Creek Branch                     | 3628 Walden Drive, Lexington, 40517,
          1 | 2c227750-4871-4b5f-937e-8f848302f894 | Margaret I. King Library                                          | 179 Funkhouser Drive, Lexington, Kentucky 40506
          1 | 550afaa5-7998-4b6c-a74e-005ac84ea74e | Student Center Cinema                                             | 160, Ave of Champions, Lexington 40508
         11 | 8f5c7260-047a-11e8-908e-07b6d0fe598a | The Gatton Student Center Cinema                                  | 160 Avenue of Champions, Lexington, Kentucky 40526
          1 | 030f6741-5e1c-4651-ad45-e0f1ac4f5e4b | UK Special Collections Research Center - Margaret I. King Library | 179 Funkhouser Drive, Lexington, Kentucky 40506
          1 | 81bdf6b0-8511-11ea-813a-37e204d97a42 | WRFL                                                              | Lexington, Lexington, Kentucky
          2 | 4f914d60-b922-11e9-b43e-456ed9f9b823 | WRFL-LIVE                                                         | Lexington, Kentucky
          4 | e48b9d30-d6ae-11ea-b6c2-ad89bfbe302e | WUKY                                                              | https://www.wuky.org/
          1 | e592ac95-b63a-4439-a3ae-a282532ff448 | WUKY Studios                                                      | 2640 Spurr Road, Lexington, Kentucky 40511
(8 rows)

Checks & updates for soft-deleting duplicate venues:

// Country Boy (might have no events, but run just in case)
update events set venue_id = '7ba33ec9-ca8e-4700-97a5-d202acea7b0e' where venue_id = 'eb3b9bbc-e804-463d-84cb-1752f0dce43c';

// Central Library
update events set venue_id = '597ba75c-4832-41ea-9106-01f6cfd5b88c' where venue_id = '57045000-3200-11e9-b43e-456ed9f9b823';

// Tates Creek Library (might have no events)
update events set venue_id = '2d74ad30-3475-11e9-b43e-456ed9f9b823' where venue_id = '5623b730-3475-11e9-b43e-456ed9f9b823';

// Gatton Student Center Cinema (might have no events)
update events set venue_id = '8f5c7260-047a-11e8-908e-07b6d0fe598a' where venue_id = '550afaa5-7998-4b6c-a74e-005ac84ea74e';

// Margaret I. King Library
update events set venue_id = '2c227750-4871-4b5f-937e-8f848302f894' where venue_id = '030f6741-5e1c-4651-ad45-e0f1ac4f5e4b';

// WRFL
update events set venue_id = '81bdf6b0-8511-11ea-813a-37e204d97a42' where venue_id = '4f914d60-b922-11e9-b43e-456ed9f9b823';

// WUKY
update events set venue_id = 'e48b9d30-d6ae-11ea-b6c2-ad89bfbe302e' where venue_id = 'e592ac95-b63a-4439-a3ae-a282532ff448';

// double-check events before soft-deleting venues - should be 0
select count(*) from events where venue_id in ('eb3b9bbc-e804-463d-84cb-1752f0dce43c', '030f6741-5e1c-4651-ad45-e0f1ac4f5e4b', '4f914d60-b922-11e9-b43e-456ed9f9b823', 'e592ac95-b63a-4439-a3ae-a282532ff448', '57045000-3200-11e9-b43e-456ed9f9b823', '5623b730-3475-11e9-b43e-456ed9f9b823', '550afaa5-7998-4b6c-a74e-005ac84ea74e');

// soft-delete duplicate venues
update venues set is_soft_deleted = 't' where id in ('eb3b9bbc-e804-463d-84cb-1752f0dce43c', '030f6741-5e1c-4651-ad45-e0f1ac4f5e4b', '4f914d60-b922-11e9-b43e-456ed9f9b823', 'e592ac95-b63a-4439-a3ae-a282532ff448', '57045000-3200-11e9-b43e-456ed9f9b823', '5623b730-3475-11e9-b43e-456ed9f9b823', '550afaa5-7998-4b6c-a74e-005ac84ea74e');

// also soft-delete old North Lime Donuts location
update venues set is_soft_deleted = 't' where id = '31691000-c83d-11e9-b43e-456ed9f9b823';