MachineVisionUiB / machinevision

We are developing a database to map and interpret the representations and uses of machine vision technologies in digital art, computer games and narratives such as science fiction novels, movies and creepypasta.
http://uib.no/en/machinevision

Rename headers in export to avoid spaces and increase clarity #167

Closed jilltxt closed 2 years ago

jilltxt commented 2 years ago

We need to rename the headers in the data exports as listed below.

CreativeWorks_long.csv / CreativeWorks_wide.csv

| Old name and order | New name | New order if changed |
| --- | --- | --- |
| Title | WorkTitle | WorkID |
| ID | WorkID | WorkTitle |
| Year | Year | |
| Country | Country | |
| Genre | Genre | |
| "Genre ID" | skip | |
| "Technologies referenced" | TechRef | |
| "Technologies referenced id" | skip | |
| "Technologies used" | TechUsed | |
| "Technologies used id" | skip | |
| Topic | Topic | |
| "Topic id" | skip | |
| Sentiment | Sentiment | |
| "Sentiment id" | skip | |
| "Situation machine vision is used in" | Situation | |
| "Situation machine vision is used in ID" | SituationID | |
| Characters | Character | |
| "Characters ID" | CharacterID | |

Situations_long.csv / Situations_wide.csv

| Old name and order | New name | New order if changed |
| --- | --- | --- |
| Title | SituationTitle | SituationID |
| ID | SituationID | SituationTitle |
| "Publication Type" | Genre | |
| "Publication Type ID" | skip | |
| "This character" | Character | |
| "This character ID" | CharacterID | |
| Entity | Entity | |
| "Entity ID" | skip | |
| Technology | Technology | |
| "Technology ID" | skip | |
| Verb | Verb | |
| "Verb ID" | skip | |

NarrativeGenres.csv

Jill fixed this - all done. Note: Remove GenreID from this export as we do not need it.

| Old name and order | New name | New order |
| --- | --- | --- |
| Title | WorkTitle | WorkID |
| ID | WorkID | WorkTitle |
| Genre | Genre | |
| Genre ID | skip | |

Characters.csv

Jill successfully rewrote Characters_long.csv and committed it to the repo as Characters.csv. All done.

| Old name and order | New name | New order |
| --- | --- | --- |
| Character | Character | CharacterID |
| ID | CharacterID | Character |
| Race/ethnicity | RaceOrEthnicity | Species |
| Gender | Gender | Gender |
| Species | Species | RaceOrEthnicity |
| Age | Age | Age |
| Sexuality | Sexuality | Sexuality |
| IsGroup | IsGroup | |
| IsCustomizable | IsCustomizable | |
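
For anyone who needs to apply the same renaming to a local copy of an older export, the Characters.csv mapping above could be sketched in R roughly as follows. This is only an illustration, not how the export itself is produced: the data frame `old`, its placeholder values, and the helper `rename_and_reorder()` are all made up for the example, and only the header names come from the table.

```r
# Hypothetical stand-in for an old Characters export.
# The values are placeholders, not real data from the database.
old <- data.frame(
  Character = c("Example A", "Example B"),
  ID = c(1L, 2L),
  `Race/ethnicity` = c("x", "y"),
  Gender = c("x", "y"),
  Species = c("x", "y"),
  check.names = FALSE,       # keep "Race/ethnicity" as written
  stringsAsFactors = FALSE
)

# Old header -> new header, copied from the Characters.csv table.
rename_map <- c(
  "Character" = "Character",
  "ID" = "CharacterID",
  "Race/ethnicity" = "RaceOrEthnicity",
  "Gender" = "Gender",
  "Species" = "Species",
  "Age" = "Age",
  "Sexuality" = "Sexuality",
  "IsGroup" = "IsGroup",
  "IsCustomizable" = "IsCustomizable"
)

# New column order from the same table (the last two are assumed to stay at the end).
new_order <- c("CharacterID", "Character", "Species", "Gender",
               "RaceOrEthnicity", "Age", "Sexuality",
               "IsGroup", "IsCustomizable")

# Hypothetical helper: rename headers, then reorder the columns.
rename_and_reorder <- function(df, rename_map, new_order) {
  # Fail loudly if the export contains a header the map does not cover.
  unknown <- setdiff(names(df), names(rename_map))
  if (length(unknown) > 0) {
    stop("Unmapped headers: ", paste(unknown, collapse = ", "))
  }
  names(df) <- unname(rename_map[names(df)])
  # Keep only the columns that are present, in the agreed order.
  df[, intersect(new_order, names(df)), drop = FALSE]
}

new <- rename_and_reorder(old, rename_map, new_order)
names(new)
```

Stopping on unmapped headers is a deliberate choice in this sketch, so that a header missing from the table is noticed rather than silently dropped.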
jilltxt commented 2 years ago

I'm assigning myself to update this in the Data in Brief paper, but will need @steinmb to fix the export.

jilltxt commented 2 years ago

After adding missing character data as described in #166, I made the necessary changes to headers in the data export for Characters_long and created a revised data export for Characters called Characters V3 (Download from.)

I uploaded Characters to the GitHub repository for the data visualisations - go ahead and download it from there.

steinmb commented 2 years ago

@jilltxt verifying Situations and have one Q:

> Foo <- read.csv("./Data/situations_long.csv")
> head(Foo)
  SituationID                                                      SituationTitle     Genre GenreID                  Character Entity Technology       Verb
1        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning) Narrative   20001 Leia Skywalker Organa Solo                     Informed
2        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning)     Movie 2000130 Leia Skywalker Organa Solo                   Discussing
3        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning) Narrative   20001                   Han Solo                     Learning
4        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning) Narrative   20001                      C-3PO                     Informed
5        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning)     Movie 2000130                      C-3PO                   Discussing
6        3269 Star Wars: Episode VII - The Force Awakens (Holograms for planning)     Movie 2000130                                    Holograms    Mapping

The spec in the issue summary does not list this as a wanted change, but shouldn't the GenreID column come before Genre? I was under the impression that all the IDs we need should be listed first.

jilltxt commented 2 years ago

Yes, it would be nice if GenreID comes before Genre. Thanks for spotting that!!
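
Until the export itself is changed, the column order can also be adjusted after reading the file. A minimal sketch in R, assuming a data frame with the headers from the `head()` output above; `move_before()` is a hypothetical helper written for this example, and the two rows are copied from that output:

```r
# Two rows copied from the situations_long.csv output in this thread.
situations <- data.frame(
  SituationID = c(3269L, 3269L),
  SituationTitle = rep(
    "Star Wars: Episode VII - The Force Awakens (Holograms for planning)", 2),
  Genre = c("Narrative", "Movie"),
  GenreID = c(20001L, 2000130L),
  Character = rep("Leia Skywalker Organa Solo", 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move column `col` so it sits directly before `before`.
move_before <- function(df, col, before) {
  stopifnot(col %in% names(df), before %in% names(df))
  rest <- setdiff(names(df), col)   # all columns except the one being moved
  pos <- match(before, rest)        # where the target column sits now
  df[, append(rest, col, after = pos - 1), drop = FALSE]
}

situations <- move_before(situations, "GenreID", "Genre")
names(situations)
```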

steinmb commented 2 years ago

Sort order. Agree, we should standardise on something:

As mentioned in the issue summary, we could sort by publication date. Only creative works have this field ("Indicate the year of first publication."), so perhaps sort works by the year?

jilltxt commented 2 years ago

Yes, sort CreativeWorks by the year, that'd be great! Maybe just sort the others alphabetically by the title?

steinmb commented 2 years ago

Sounds good. Then we sort in the following order:

  1. Year
  2. Title
  3. ID
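
For checking an export against this order on the R side, the three-level sort could look roughly like the sketch below. It assumes the CreativeWorks headers `Year`, `WorkTitle` and `WorkID`; the two "Example" rows are invented to show the tie-break on ID, and placing works without a year last is a choice of this sketch, not something agreed in this issue.

```r
works <- data.frame(
  WorkID = c(3194L, 164L, 9002L, 9001L),
  WorkTitle = c("comeback", "Trette Menn", "Example", "Example"),  # "Example" rows are made up
  Year = c(NA, 1891L, 1891L, 1891L),
  stringsAsFactors = FALSE
)

# Sort by Year, then Title, then ID.
# order() puts NA last by default (na.last = TRUE), so works with no year end up at the bottom.
sorted <- works[order(works$Year, works$WorkTitle, works$WorkID), ]
sorted$WorkID
```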
steinmb commented 2 years ago

Test exports:

> Foo <- read.csv("./Data/creativeworks_long.csv")
> head(Foo)
  WorkID   WorkTitle Year Country     Genre TechRef  TechUsed                 Topic Sentiment                        Situation SituationID Character CharacterID
1   3194    comeback   NA  Turkey       Art         Filtering                Nudity  Exciting comeback (rejected by Instagram)        3195                    NA
2   3194    comeback   NA  Turkey       Art         Filtering                Nudity    Flawed comeback (rejected by Instagram)        3195                    NA
3   3194    comeback   NA  Turkey       Art         Filtering          Social Media  Exciting comeback (rejected by Instagram)        3195                    NA
4   3194    comeback   NA  Turkey       Art         Filtering          Social Media    Flawed comeback (rejected by Instagram)        3195                    NA
5    164 Trette Menn 1891  Norway Narrative                   Romantic relationship  Exciting        Trette menn (Skjermbrett)        2866                    NA
6    164 Trette Menn 1891  Norway Narrative                   Romantic relationship   Helpful        Trette menn (Skjermbrett)        2866                    NA

Datavis repo

acb1f8a (HEAD -> main, origin/main) Creative works export
d55e0a3 Updated situations with reduced data set
9027a06 Updated situations export
b36f326 Updated situations wide export
8e4959c Filename type
b29cb21 Situation data export updated
fbb7507 Characters filename have been normalized
4aebdea Do not use capital letters in filename, if not needed
037632e Characters wide export
518f59e New Characters_long data export
3b46a75 Jill v3 updates moved

Drupal config

84b72d3 (HEAD -> master, origin/master, origin/HEAD) Creative work sort order
b556a9a Generic reduced creative work data set exports
0c2e7b7 Make sure creative works v3 do not collide with older versions
4988e36 Situations wide reduced data set
0709d86 Fix typo in exported filenames
b71929e Situations views new labels and order
8cf3983 Do not include version number in filename
cb8a9f3 Situations views new titles
5698134 Do not add export version to file name
ec74a47 Minor changes to Character V3
f37db92 Allow Drupal and composer installers to run
057d3e7 Update drupal core drupal/core (9.3.2)
dd843c3 Jill - Data export v3 work
steinmb commented 2 years ago

All data exports have been updated. Note that there is a new directory structure in the Datavis repo.

.
|-- Cleaned
|   |-- Gephi\ ready
|   |   `-- Gephi-Ready-Game-Situation-Agent-Verb.xlsx
|   `-- Master-Doc-Game-Situation-Agent-Verb.xlsx
|-- README.md
|-- data
|   |-- README.txt
|   |-- processed
|   |   |-- NarrativeGenres.csv
|   |   `-- characters.csv
|   `-- raw
|       |-- characters_long.csv
|       |-- characters_wide.csv
|       |-- creativeworks_long.csv
|       |-- creativeworks_wide.csv
|       |-- narrativegenres_long.csv
|       |-- narrativegenres_wide.csv
|       |-- situations_long.csv
|       `-- situations_wide.csv
|-- database_relation_map.png
|-- original_data_export
|-- output
|   `-- Characters_Description.html
`-- src
    |-- Characters_Description.Rmd
    |-- Define_fixed_vocabularies.R
    |-- Read_Characters_Make_Node_Table.R
    |-- XML2.R
    `-- characters.xml
steinmb commented 2 years ago

I'm going to double-check all the data export views to make sure I got all the changes.