edgeryders / discourse-annotator

A text annotation and analysis application for Discourse. Made with Annotator.js and Ruby on Rails.
https://edgeryders.eu/t/6811

topic_id missing from some annotations? #199

Closed albertocottica closed 4 years ago

albertocottica commented 4 years ago

I have found this: https://edgeryders.eu/annotator/annotations/18625.json.

And this: https://edgeryders.eu/annotator/annotations/18551.json

And this: https://edgeryders.eu/annotator/annotations/18540.json

Each correctly returns the post_id, but the topic_id field is null.

Is this a one-off or should I adapt my code around it?

albertocottica commented 4 years ago

Definitely not a one-off. Here is the list of annotation IDs in POPREBEL that have topic_id: null:

[18625, 18624, 18551, 18548, 18547, 18546, 18545, 18540, 18537, 18536, 18535, 18531, 18523, 18508, 18505, 18502, 18497, 18496, 18491, 18487, 18482, 17426, 17421, 17418, 17415, 17413, 17119, 17105, 17081, 17034, 17025, 17018, 17011, 17008, 17007, 17006, 16999, 16992, 16985, 16982, 16970, 16967, 16963, 16961, 16960, 16935, 16922]
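A list like the one above can be produced with a small client-side check. The following is only a minimal sketch: it assumes the annotations have already been fetched and parsed into dictionaries, and that each one carries an `id` key alongside the `post_id` and `topic_id` fields mentioned in this issue. The `id` key, the function name, and the sample `post_id` and `topic_id` values are illustrative assumptions, not taken from the actual API.

```python
from typing import Any, Iterable


def ids_missing_topic(annotations: Iterable[dict[str, Any]]) -> list[int]:
    """Return IDs of annotations whose topic_id is absent or null, highest ID first."""
    # .get() treats a missing key the same as an explicit null (None).
    missing = [a["id"] for a in annotations if a.get("topic_id") is None]
    return sorted(missing, reverse=True)


# Hypothetical sample data; the post_id and topic_id values are made up.
sample = [
    {"id": 18625, "post_id": 101, "topic_id": None},
    {"id": 18626, "post_id": 102, "topic_id": 77},
    {"id": 18551, "post_id": 103},  # topic_id key absent altogether
]

print(ids_missing_topic(sample))
```

Treating a missing key and an explicit null the same way keeps the check robust if the JSON serializer ever omits empty fields.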

damingo commented 4 years ago

There were 84 annotations in total with topic_id: null. I was able to fix 47 of them, since only the topic_id was unset and the topic itself was still available. For the remaining 37 annotations (listed below), the topics are no longer available because they were deleted in Discourse. I'm not sure how best to handle annotations with deleted topics. It's rare: only about 40 annotations out of about 20,000. If possible, the easiest option might be to simply ignore them on the client side.

https://edgeryders.eu/annotator/annotations/8720.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/10305.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/7036.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/7509.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/8967.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/8303.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/5740.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/9199.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/8864.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/5419.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/6968.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/11394.json created: 2016-10-12 post_id: 658
https://edgeryders.eu/annotator/annotations/11602.json created: 2017-10-20 post_id: 658
https://edgeryders.eu/annotator/annotations/11603.json created: 2017-10-20 post_id: 658
https://edgeryders.eu/annotator/annotations/11604.json created: 2017-10-20 post_id: 658
https://edgeryders.eu/annotator/annotations/11605.json created: 2017-10-20 post_id: 658
https://edgeryders.eu/annotator/annotations/11606.json created: 2017-10-20 post_id: 658
https://edgeryders.eu/annotator/annotations/12146.json created: 2017-11-27 post_id: 36424
https://edgeryders.eu/annotator/annotations/12258.json created: 2017-11-27 post_id: 34974
https://edgeryders.eu/annotator/annotations/12259.json created: 2017-11-27 post_id: 34974
https://edgeryders.eu/annotator/annotations/12407.json created: 2017-11-28 post_id: 34172
https://edgeryders.eu/annotator/annotations/12408.json created: 2017-11-28 post_id: 34172
https://edgeryders.eu/annotator/annotations/12409.json created: 2017-11-28 post_id: 34172
https://edgeryders.eu/annotator/annotations/12410.json created: 2017-11-28 post_id: 34172
https://edgeryders.eu/annotator/annotations/12705.json created: 2017-11-28 post_id: 2158
https://edgeryders.eu/annotator/annotations/12706.json created: 2017-11-28 post_id: 2158
https://edgeryders.eu/annotator/annotations/13858.json created: 2018-07-27 post_id: 37972
https://edgeryders.eu/annotator/annotations/13859.json created: 2018-07-27 post_id: 38265
https://edgeryders.eu/annotator/annotations/13860.json created: 2018-07-27 post_id: 38265
https://edgeryders.eu/annotator/annotations/13861.json created: 2018-07-27 post_id: 37972
https://edgeryders.eu/annotator/annotations/15813.json created: 2019-06-27 post_id: 0
https://edgeryders.eu/annotator/annotations/15816.json created: 2019-06-28 post_id: 0
https://edgeryders.eu/annotator/annotations/15822.json created: 2019-06-28 post_id: 0
https://edgeryders.eu/annotator/annotations/15823.json created: 2019-06-28 post_id: 0
https://edgeryders.eu/annotator/annotations/15824.json created: 2019-06-28 post_id: 0
https://edgeryders.eu/annotator/annotations/15825.json created: 2019-06-28 post_id: 0
https://edgeryders.eu/annotator/annotations/15826.json created: 2019-06-28 post_id: 0
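The split described in this comment, between annotations whose topic can still be looked up from the post and those whose topic was deleted, can be sketched on the client side as follows. This is not the fix that was applied on the server; it is an illustration under assumptions. The `post_to_topic` lookup (post_id to topic_id, containing only posts whose topic still exists), the `id` key, the function name, and the sample topic ID are all hypothetical.

```python
from typing import Any

Annotation = dict[str, Any]


def split_by_topic(
    annotations: list[Annotation],
    post_to_topic: dict[int, int],
) -> tuple[list[Annotation], list[Annotation]]:
    """Fill in topic_id where the post's topic is known; set the rest aside.

    Returns (usable, orphaned). Input dictionaries are not modified.
    """
    usable: list[Annotation] = []
    orphaned: list[Annotation] = []
    for ann in annotations:
        if ann.get("topic_id") is not None:
            usable.append(ann)  # nothing to repair
            continue
        topic_id = post_to_topic.get(ann.get("post_id"))
        if topic_id is None:
            orphaned.append(ann)  # topic deleted or post unknown
        else:
            usable.append({**ann, "topic_id": topic_id})  # repaired copy
    return usable, orphaned


# Hypothetical lookup: only post 36424 still maps to a live topic here.
post_to_topic = {36424: 7001}
annotations = [
    {"id": 12146, "post_id": 36424, "topic_id": None},
    {"id": 15813, "post_id": 0, "topic_id": None},
]

usable, orphaned = split_by_topic(annotations, post_to_topic)
print([a["id"] for a in usable], [a["id"] for a in orphaned])
```

A client that only wants to ignore the broken annotations can process `usable` and drop `orphaned`, optionally logging the orphaned IDs so they can be reported.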

albertocottica commented 4 years ago

I see. @damingo, thanks for fixing what could be fixed. I don't think the rest are worth fixing. Do we have a place to put known issues? Also ping @tanius. Or do we simply delete them, since they are useless to us that way?

tanius commented 4 years ago

> Or do we simply delete them, since they are useless to us that way?

In the interest of data integrity, and because there are only a few of them: yes, @damingo, please delete them and close the issue. If there were thousands, we'd have to think about other solutions, but this case is easy :slightly_smiling_face:

damingo commented 4 years ago

Done.