davidskalinder / mpeds-coder

MPEDS Annotation Interface

Check event edits #104

Closed davidskalinder closed 4 years ago

davidskalinder commented 4 years ago

From @olderwoman's agenda item in the weekly meeting:

Make sure we know what happens with MAI under the following circumstances:

1) An event is deleted.
2) There is a change in the description of the event.
3) Other characteristics of the event are changed.
4) An article shifts from having events to not having events.
5) An article shifts from having no events to having events.
6) What happens if there is no event and an article is deemed to not have any relevant events? Is that being passed to John?

If any of this stuff ends up being different than we expect, it might affect @johnklemke's import into pass 2...

davidskalinder commented 4 years ago

Okay, I'll reorder these items slightly since the last two set up the investigation of the others. I'll describe what I do in the UI and show the state of the DB after each action. Note that this doesn't quite show what @johnklemke will get in the MAI export: unless something has gone wrong, that file should contain (a wide-formatted version of) all the info from the DB except for the timestamps. I haven't included the contents of that file because it'll be harder to show what's changed at every step, but if there's something in there that we don't trust, let me know and I can walk through the whole process (or any part we want) again looking at that file's contents after each action.
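To make "a wide-formatted version of all the info from the DB except for the timestamps" concrete, here is a minimal sketch of that kind of pivot. It is not the actual MAI export code: the function name `to_wide`, the grouping key, and the rule for choosing between `text` and `value` are all assumptions made for illustration.

```python
from collections import defaultdict

# Long-format rows as they appear in coder_event_creator:
# (id, article_id, event_id, variable, value, text, coder_id, timestamp)
long_rows = [
    (833, 5794, 67, "actor-text", "3-172-184", "Philadelphia", 2, "2020-07-13 14:56:37"),
    (835, 5794, 67, "date-est", "approximate", None, 2, "2020-07-13 14:56:49"),
    (836, 75, 68, "bystander-text", "1-94-99", "about", 1, "2020-07-14 17:42:00"),
]


def to_wide(rows):
    """Pivot long rows into one record per (article_id, event_id, coder_id).

    Hypothetical rule: keep the selected text when there is one, else the value.
    The id and timestamp columns are dropped.
    """
    wide = defaultdict(dict)
    for _id, article_id, event_id, variable, value, text, coder_id, _ts in rows:
        cell = text if text is not None else value
        # A variable may occur more than once per event, so collect a list.
        wide[(article_id, event_id, coder_id)].setdefault(variable, []).append(cell)
    return dict(wide)


wide = to_wide(long_rows)
for key in sorted(wide):
    print(key, wide[key])
```

Each printed record corresponds to one row of a wide file, with one column per variable.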

First, here's the initial state of the DB (the last few lines of the tables that will change), before I've done anything:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+-------------------+-----------+------+----------+---------------------+
| id | article_id | variable          | value     | text | coder_id | timestamp           |
+----+------------+-------------------+-----------+------+----------+---------------------+
| 39 |       4088 | article-desc      | qweqweqwe | NULL |        2 | 2020-07-07 15:03:02 |
| 13 |       4088 | article-uncertain |           | NULL |        2 | 2020-06-17 11:09:51 |
| 11 |       1663 | article-desc      |           | NULL |        2 | 2020-06-16 16:06:34 |
+----+------------+-------------------+-----------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| id  | article_id | event_id | variable       | value       | text         | coder_id | timestamp           |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| 836 |         75 |       68 | bystander-text | 1-94-99     | about        |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate | NULL         |        2 | 2020-07-13 14:56:49 |
| 833 |       5794 |       67 | actor-text     | 3-172-184   | Philadelphia |        2 | 2020-07-13 14:56:37 |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
3 rows in set (0.00 sec)

So, that's the pre-testing state: just stuff left over from whatever I was testing last week. Here goes with the items:

6) What happens if there is no event and an article is deemed to not have any relevant events?

I'll add an article description and mark the "No protests" checkbox:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| id  | article_id | event_id | variable       | value       | text         | coder_id | timestamp           |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| 836 |         75 |       68 | bystander-text | 1-94-99     | about        |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate | NULL         |        2 | 2020-07-13 14:56:49 |
| 833 |       5794 |       67 | actor-text     | 3-172-184   | Philadelphia |        2 | 2020-07-13 14:56:37 |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
3 rows in set (0.00 sec)

So, the article-level info has been correctly added to the article-level table and the event-level table hasn't changed.

Is that being passed to John?

Like I said above, it should be, although to make absolutely sure we'd have to run an export and check the file.

5) An article shifts from having no events to having events.

I'll add an event with nothing in it.

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| id  | article_id | event_id | variable       | value       | text         | coder_id | timestamp           |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| 836 |         75 |       68 | bystander-text | 1-94-99     | about        |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate | NULL         |        2 | 2020-07-13 14:56:49 |
| 833 |       5794 |       67 | actor-text     | 3-172-184   | Philadelphia |        2 | 2020-07-13 14:56:37 |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
3 rows in set (0.00 sec)

No change anywhere that we care about. (NB: there is a table called event which, as far as I can tell, is redundant and contains nothing but a list of article id / event id pairs. Adding an empty event does add a new event ID to this table, but neither the coder-table export nor any of the exports I've built uses it: they only include events that have some information attached to them.)
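The distinction between "exists in event" and "shows up in an export" can be sketched like this. It uses an in-memory SQLite database rather than the real MySQL one, and the column layout of the event table is an assumption based only on the description above (article id / event id pairs).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed shape of the `event` table: just event id / article id pairs.
    CREATE TABLE event (id INTEGER PRIMARY KEY, article_id INTEGER);
    CREATE TABLE coder_event_creator (
        id INTEGER PRIMARY KEY, article_id INTEGER, event_id INTEGER,
        variable TEXT, value TEXT, text TEXT, coder_id INTEGER, timestamp TEXT
    );
    -- Event 68 has data attached; event 70 was added but left empty.
    INSERT INTO event VALUES (68, 75);
    INSERT INTO event VALUES (70, 1579);
    INSERT INTO coder_event_creator VALUES
        (836, 75, 68, 'bystander-text', '1-94-99', 'about', 1, '2020-07-14 17:42:00');
""")

# Events that an export built from coder_event_creator would see.
exported = [r[0] for r in conn.execute(
    "SELECT DISTINCT event_id FROM coder_event_creator ORDER BY event_id")]

# Events that exist only as placeholders in `event`.
empty = [r[0] for r in conn.execute("""
    SELECT e.id FROM event AS e
    LEFT JOIN coder_event_creator AS c ON c.event_id = e.id
    WHERE c.id IS NULL
    ORDER BY e.id
""")]

print("exported:", exported)
print("empty:", empty)
```

The second query is the sort of check one could run against the real DB to list empty placeholder events, if those ever turn out to matter.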

Now I'll add a description to the new event:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-------------------+-------+----------+---------------------+
| id  | article_id | event_id | variable       | value             | text  | coder_id | timestamp           |
+-----+------------+----------+----------------+-------------------+-------+----------+---------------------+
| 837 |       1579 |       70 | desc           | Test event descr. | NULL  |        2 | 2020-07-20 14:57:53 |
| 836 |         75 |       68 | bystander-text | 1-94-99           | about |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate       | NULL  |        2 | 2020-07-13 14:56:49 |
+-----+------------+----------+----------------+-------------------+-------+----------+---------------------+
3 rows in set (0.00 sec)

So, no change to the article table, new event description in the event table.

1) An event is deleted.

I'll refresh the page so that event number 70 shows up in the UI's "Event descriptions" list (doing this doesn't change any tables). Now I'll delete it:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| id  | article_id | event_id | variable       | value       | text         | coder_id | timestamp           |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
| 836 |         75 |       68 | bystander-text | 1-94-99     | about        |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate | NULL         |        2 | 2020-07-13 14:56:49 |
| 833 |       5794 |       67 | actor-text     | 3-172-184   | Philadelphia |        2 | 2020-07-13 14:56:37 |
+-----+------------+----------+----------------+-------------+--------------+----------+---------------------+
3 rows in set (0.00 sec)

So there's no change to the article table, and every line with an event_id of 70 (there was only one) is gone from the event-level table.
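A more direct check than eyeballing the last three rows is to count the rows for the deleted event_id before and after. The sketch below does that against an in-memory SQLite copy of two of the rows shown above. The DELETE statement is only an assumed stand-in for whatever the UI's delete action actually runs, and `rows_for_event` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE coder_event_creator (
        id INTEGER PRIMARY KEY, article_id INTEGER, event_id INTEGER,
        variable TEXT, value TEXT, text TEXT, coder_id INTEGER, timestamp TEXT
    )
""")
conn.executemany(
    "INSERT INTO coder_event_creator VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (836, 75, 68, "bystander-text", "1-94-99", "about", 1, "2020-07-14 17:42:00"),
        (837, 1579, 70, "desc", "Test event descr.", None, 2, "2020-07-20 14:57:53"),
    ],
)


def rows_for_event(conn, event_id):
    """Count the rows still attached to an event."""
    query = "SELECT COUNT(*) FROM coder_event_creator WHERE event_id = ?"
    return conn.execute(query, (event_id,)).fetchone()[0]


before = rows_for_event(conn, 70)
# Assumed stand-in for whatever the UI's delete action runs.
conn.execute("DELETE FROM coder_event_creator WHERE event_id = ?", (70,))
after = rows_for_event(conn, 70)
untouched = rows_for_event(conn, 68)
print(before, after, untouched)
```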

2) There is a change in the description of the event.

Since I just deleted the only event for this article, I'll add a new one, with a new description, now:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+--------------------+-------+----------+---------------------+
| id  | article_id | event_id | variable       | value              | text  | coder_id | timestamp           |
+-----+------------+----------+----------------+--------------------+-------+----------+---------------------+
| 840 |       1579 |       71 | desc           | Another test event | NULL  |        2 | 2020-07-20 15:04:08 |
| 836 |         75 |       68 | bystander-text | 1-94-99            | about |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate        | NULL  |        2 | 2020-07-13 14:56:49 |
+-----+------------+----------+----------------+--------------------+-------+----------+---------------------+
3 rows in set (0.00 sec)

Now I'll change the description:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| id  | article_id | event_id | variable       | value                             | text  | coder_id | timestamp           |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| 842 |       1579 |       71 | desc           | An entirely different description | NULL  |        2 | 2020-07-20 15:05:10 |
| 836 |         75 |       68 | bystander-text | 1-94-99                           | about |        1 | 2020-07-14 17:42:00 |
| 835 |       5794 |       67 | date-est       | approximate                       | NULL  |        2 | 2020-07-13 14:56:49 |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
3 rows in set (0.00 sec)

So as expected, the event description changes and nothing else does, apart from the row getting a new id (840 to 842) and a new timestamp.

3) Other characteristics of the event are changed.

First I'll add another characteristic. How about a start date:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| id  | article_id | event_id | variable       | value                             | text  | coder_id | timestamp           |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| 844 |       1579 |       71 | start-date     | 2020-07-01                        | NULL  |        2 | 2020-07-20 15:06:31 |
| 843 |       1579 |       71 | desc           | An entirely different description | NULL  |        2 | 2020-07-20 15:06:26 |
| 836 |         75 |       68 | bystander-text | 1-94-99                           | about |        1 | 2020-07-14 17:42:00 |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
3 rows in set (0.00 sec)

... and now I'll change that start date from July 1 to July 2:

mysql> SELECT * FROM coder_article_annotation ORDER BY timestamp DESC LIMIT 3;
+----+------------+--------------+--------------------+------+----------+---------------------+
| id | article_id | variable     | value              | text | coder_id | timestamp           |
+----+------------+--------------+--------------------+------+----------+---------------------+
| 40 |       1579 | article-desc | Test article descr | NULL |        2 | 2020-07-20 14:33:01 |
| 41 |       1579 | no-protests  | yes                | NULL |        2 | 2020-07-20 14:33:01 |
| 39 |       4088 | article-desc | qweqweqwe          | NULL |        2 | 2020-07-07 15:03:02 |
+----+------------+--------------+--------------------+------+----------+---------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM coder_event_creator ORDER BY timestamp DESC LIMIT 3;
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| id  | article_id | event_id | variable       | value                             | text  | coder_id | timestamp           |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
| 845 |       1579 |       71 | start-date     | 2020-07-02                        | NULL  |        2 | 2020-07-20 15:07:39 |
| 843 |       1579 |       71 | desc           | An entirely different description | NULL  |        2 | 2020-07-20 15:06:26 |
| 836 |         75 |       68 | bystander-text | 1-94-99                           | about |        1 | 2020-07-14 17:42:00 |
+-----+------------+----------+----------------+-----------------------------------+-------+----------+---------------------+
3 rows in set (0.00 sec)

So again, there's a change where we expect it and nowhere else.
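One detail in the outputs above is that an edited row comes back with a new id (840 to 842 for desc, 844 to 845 for start-date), which suggests that an edit replaces the row rather than updating it in place. I haven't confirmed that in the MAI code; the sketch below only illustrates what a delete-then-insert edit would look like, using SQLite and a hypothetical `replace_value` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE coder_event_creator (
        id INTEGER PRIMARY KEY AUTOINCREMENT, article_id INTEGER,
        event_id INTEGER, variable TEXT, value TEXT, text TEXT,
        coder_id INTEGER, timestamp TEXT
    )
""")


def replace_value(conn, article_id, event_id, variable, value, coder_id, ts):
    """Hypothetical edit: drop the old row for this variable, insert a new one."""
    conn.execute(
        "DELETE FROM coder_event_creator "
        "WHERE article_id = ? AND event_id = ? AND variable = ? AND coder_id = ?",
        (article_id, event_id, variable, coder_id),
    )
    cur = conn.execute(
        "INSERT INTO coder_event_creator "
        "(article_id, event_id, variable, value, text, coder_id, timestamp) "
        "VALUES (?, ?, ?, ?, NULL, ?, ?)",
        (article_id, event_id, variable, value, coder_id, ts),
    )
    return cur.lastrowid


old_id = replace_value(conn, 1579, 71, "start-date", "2020-07-01", 2, "2020-07-20 15:06:31")
new_id = replace_value(conn, 1579, 71, "start-date", "2020-07-02", 2, "2020-07-20 15:07:39")
current = conn.execute(
    "SELECT id, value FROM coder_event_creator WHERE event_id = 71"
).fetchall()
print(old_id, new_id, current)
```

If edits really do work this way, row ids are not stable identifiers for a piece of coding, so anything downstream would be safer keying on article, event, variable, and coder instead.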

4) An article shifts from having events to not having events.

I think this is covered in 1) above: deleting an article's only event leaves it with no rows in the event-level table.

So I think all of that is working as expected. Like I said above, an export run at any stage in this sequence should contain the state of the database at that point (and nothing else). And as we discussed this morning, I think the pass 2 system will have to figure out how to stay in sync with these changes while preserving any pass 2 work that was based on old states.
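As one possible starting point for the sync problem, two snapshots of the event-level table can be compared on content alone, ignoring id and timestamp. This is a rough sketch, not a proposal for how pass 2 should actually work; `content_key` and `diff_snapshots` are hypothetical names, and the rows are the before and after states of the description edit shown above.

```python
def content_key(row):
    """Drop id (first) and timestamp (last); neither describes the coding itself."""
    return tuple(row[1:-1])


def diff_snapshots(old_rows, new_rows):
    """Return (added, removed) content tuples between two snapshots."""
    old = {content_key(r) for r in old_rows}
    new = {content_key(r) for r in new_rows}
    # Sort on repr so tuples containing None still order deterministically.
    return sorted(new - old, key=repr), sorted(old - new, key=repr)


old_rows = [
    (836, 75, 68, "bystander-text", "1-94-99", "about", 1, "2020-07-14 17:42:00"),
    (840, 1579, 71, "desc", "Another test event", None, 2, "2020-07-20 15:04:08"),
]
new_rows = [
    (836, 75, 68, "bystander-text", "1-94-99", "about", 1, "2020-07-14 17:42:00"),
    (842, 1579, 71, "desc", "An entirely different description", None, 2, "2020-07-20 15:05:10"),
]

added, removed = diff_snapshots(old_rows, new_rows)
print("added:", added)
print("removed:", removed)
```

An edit shows up as one removed tuple plus one added tuple for the same article, event, variable, and coder; a deleted event shows up as removals only.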

@olderwoman / @johnklemke, I think this is all I can do for this for now, but let me know if there's any other info you need? I'll move the issue to my user testing / QA column until you want me to close it.

davidskalinder commented 4 years ago

I think no news is good news on this one, and since there are no code changes, there's nothing that needs paranoia-testing, so I'm going to close this.