MachineVisionUiB / machinevision

We are developing a database to map and interpret the representations and uses of machine vision technologies in digital art, computer games and narratives such as science fiction novels, movies and creepypasta.
http://uib.no/en/machinevision

Duplicate IDs among verbs #165

Closed LindaKairus closed 2 years ago

LindaKairus commented 2 years ago

It seems there are still duplicates among the Verb IDs, and some verbs also have several IDs. Here is a list of the duplicates I have identified:


| Duplicate IDs: verb | ID |   | Duplicate verbs: verb | ID |
| -- | -- | -- | -- | -- |
| Killed | 2000346 |   | Killed | 2000346 |
| Killing | 2000346 |   | Killed | 2000723 |
| Classifying | 2000367 |   | Killing | 2000344 |
| Controlling | 2000402 |   | Killing | 2000346 |
| Joking | 2000402 |   | Killing | 2000723 |
| Manipulating | 2000402 |   | Classifying | 2000367 |
| Protecting | 2000411 |   | Classifying | 2000759 |
| Warning | 2000411 |   | Controlling | 2000402 |
| Participating | 2000434 |   | Controlling | 2000440 |
| Tricked | 2000434 |   | Controlling | 2000527 |
| Evading | 2000437 |   | Controlling | 2000759 |
| Inciting | 2000437 |   | Joking | 2000402 |
| Countersurveilling | 2000438 |   | Joking | 2000528 |
| Evading | 2000438 |   | Manipulating | 2000402 |
| Countersurveilling | 2000439 |   | Manipulating | 2000486 |
| Controlling | 2000440 |   | Protecting | 2000411 |
| Identifying | 2000440 |   | Protecting | 2000440 |
| Inciting | 2000440 |   | Warning | 2000411 |
| Protecting | 2000440 |   | Warning | 2000884 |
| Decorating | 2000453 |   | Participating | 2000434 |
| Promoting | 2000453 |   | Participating | 2000629 |
| Embellishing | 2000455 |   | Tricked | 2000434 |
| Informing | 2000455 |   | Tricked | 2000723 |
| Creating | 2000456 |   | Inciting | 2000437 |
| Informing | 2000456 |   | Inciting | 2000440 |
| Stalking | 2000617 |   | Countersurveilling | 2000438 |
| Targeting | 2000617 |   | Countersurveilling | 2000439 |
| Participating | 2000629 |   | Evading | 2000437 |
| Viewed | 2000629 |   | Evading | 2000438 |
| Killed | 2000723 |   | Identifying | 2000354 |
| Killing | 2000723 |   | Identifying | 2000440 |
| Tricked | 2000723 |   | Protecting | 2000411 |
| Classifying | 2000759 |   | Protecting | 2000440 |
| Controlling | 2000759 |   | Decorating | 2000448 |
| Aware | 2000779 |   | Decorating | 2000452 |
| Viewed | 2000779 |   | Decorating | 2000453 |
|   |   |   | Embellishing | 2000454 |
|   |   |   | Embellishing | 2000455 |
|   |   |   | Informing | 2000449 |
|   |   |   | Informing | 2000455 |
|   |   |   | Informing | 2000456 |
|   |   |   | Stalking | 2000617 |
|   |   |   | Stalking | 2000677 |
|   |   |   | Targeting | 2000617 |
|   |   |   | Targeting | 2000879 |
|   |   |   | Viewed | 2000629 |
|   |   |   | Viewed | 2000779 |
|   |   |   | Aware | 2000653 |
|   |   |   | Aware | 2000779 |

LindaKairus commented 2 years ago

I noticed the duplicates when I tried to import my Nodes list into Gephi and got duplicate errors. I downloaded sitiuations_long_v2 (1).csv once more and imported it into Excel. I did not remove duplicates or change anything, I just sorted and searched within the file, and the same duplicates appear.

Screenshot 2021-12-16 at 14 13 18
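
For anyone who wants to repeat this check without Excel, here is a minimal sketch in Python. It groups rows in both directions: IDs that are attached to more than one verb, and verbs that carry more than one ID. The column names `verb` and `verb_id` and the helper names are assumptions for illustration only; the actual headers in the export may differ and would need to be passed in.

```python
import csv
from collections import defaultdict


def find_verb_id_conflicts(rows, verb_col="verb", id_col="verb_id"):
    """Return (ids shared by several verbs, verbs with several ids).

    `verb_col` and `id_col` are hypothetical column names; adjust them
    to match the real headers in the exported CSV.
    """
    verbs_by_id = defaultdict(set)
    ids_by_verb = defaultdict(set)
    for row in rows:
        verb = (row.get(verb_col) or "").strip()
        verb_id = (row.get(id_col) or "").strip()
        if not verb or not verb_id:
            continue  # skip rows without a verb or an ID
        verbs_by_id[verb_id].add(verb)
        ids_by_verb[verb].add(verb_id)
    shared_ids = {i: sorted(v) for i, v in verbs_by_id.items() if len(v) > 1}
    multi_id_verbs = {v: sorted(i) for v, i in ids_by_verb.items() if len(i) > 1}
    return shared_ids, multi_id_verbs


def load_rows(path):
    """Read a CSV export into a list of dicts keyed by column header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Small in-memory sample using a few pairs from the list above.
sample = [
    {"verb": "Killed", "verb_id": "2000346"},
    {"verb": "Killing", "verb_id": "2000346"},
    {"verb": "Killing", "verb_id": "2000344"},
    {"verb": "Joking", "verb_id": "2000528"},
]
shared_ids, multi_id_verbs = find_verb_id_conflicts(sample)
print(shared_ids)
print(multi_id_verbs)
```

To run it on the real file, something like `find_verb_id_conflicts(load_rows("sitiuations_long_v2.csv"), verb_col=..., id_col=...)` should work once the correct column names are filled in.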


LindaKairus commented 2 years ago

Is there something going wrong in the export? Can this be fixed in the export or do we need to clean it manually?

steinmb commented 2 years ago

Hi Linda, thanks for bringing this to my attention. I'll have a look at both the data and the exports themselves. Is this from sitiuations_long_v2.csv?

LindaKairus commented 2 years ago

Yes, it is sitiuations_long_v2.csv. I double-checked against sitiuations_long_v2 (1).csv, and it has the same problem. Thank you for having a look at it.

steinmb commented 2 years ago

At first I suspected the ID itself, where we pad the term ID, was randomly failing. But running the export with the UUID shows that the ID is right. What fails is getting the right term ID (the verb title).

Debugging

Screenshot 2021-12-16 at 19 56 57

jilltxt commented 2 years ago

Closing this as it doesn't matter since we're no longer exporting the Verb ID.