Display-Lab / bit-stomach

Data ingest and performer annotation

Determine requirement for identity columns explicitly named in spek. #32

Closed: grosscol closed this issue 5 years ago

grosscol commented 5 years ago

Either require identification columns in the spek, or try to handle distill_annotations without knowing the id column(s).

Background

Currently distilling the annotations requires an assumption about the id column. The id column is constructed by canonicalize_ids when the spek explicitly states which columns are used for identity.

Distilling annotations makes a data frame with columns: id, disposition, value. This is required for attributing a disposition to a performer.

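  library(dplyr)  # provides %>%, filter, select, mutate

  # Reshape wide annotations to long form, keeping only rows where value is TRUE
  annotations %>%
    tidyr::gather(key = "disposition", value = "value", -id) %>%
    filter(value == TRUE) %>%
    select(-value) %>%
    mutate(disposition = value_listify(disposition))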
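To make the dependence on the id column concrete, here is a minimal sketch of the same pipeline run on a toy annotations table. The column names `positive_trend` and `negative_gap` are invented for illustration, and `value_listify` is replaced by a stand-in that is only an assumption about what the real helper does.

```r
library(dplyr)
library(tidyr)

# Stand-in for the package's value_listify helper. ASSUMPTION: the real
# helper may behave differently; this one just turns the vector into a list.
value_listify <- function(x) as.list(x)

# Toy annotations: one row per performer, one logical column per disposition.
# The `id` column is what canonicalize_ids would normally construct.
annotations <- data.frame(
  id = c("a", "b", "c"),
  positive_trend = c(TRUE, FALSE, TRUE),
  negative_gap = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

distilled <- annotations %>%
  tidyr::gather(key = "disposition", value = "value", -id) %>%
  dplyr::filter(value == TRUE) %>%
  dplyr::select(-value) %>%
  dplyr::mutate(disposition = value_listify(disposition))

print(distilled)
```

The `-id` in the gather step is where the assumption lives: if no column named `id` exists, the call fails, and if the identifier is stored under another name, it is gathered as though it were a disposition. Performer "b" has no TRUE annotations and so does not appear in the result.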

Choices

Either way, warn if the spek does not contain an identity ColumnUse.
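A minimal sketch of such a warning follows. Everything here is hypothetical: the function name `warn_if_no_identity`, the idea of passing the ColumnUse values as a plain character vector, and the value `"identifier"` used to mark an identity column are all assumptions, and pulling those values out of the spek is not shown.

```r
# Hypothetical helper. `column_uses` is assumed to be a character vector of
# the ColumnUse values declared in the spek; `identity_use` is the assumed
# value that marks an identity column.
warn_if_no_identity <- function(column_uses, identity_use = "identifier") {
  has_identity <- identity_use %in% column_uses
  if (!has_identity) {
    warning("spek does not declare an identity ColumnUse; the id column must be assumed")
  }
  # Return the check result invisibly so callers can branch on it.
  invisible(has_identity)
}

# A spek that declares an identity column passes silently.
warn_if_no_identity(c("identifier", "value"))
```

Returning the logical result lets the caller decide whether to stop or to fall back to a guessed id column, which keeps both choices from the issue open.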