Closed stefanpauliuk closed 6 years ago
Should I make pull requests for the .xlsx files or just add my suggestions to this issue here? See also my suggestion https://github.com/IndEcol/IE_data_commons/issues/6.
I suggest adding the following UNIQUE constraints: (more constraints added by S.P., are part of .xlsx master files on Sep 15, 2018)
Column constraints (each column(s) UNIQUE):
units table: UNIQUE (unitcode), UNIQUE (unit_name) (column constraints)
users table: UNIQUE (username) (column constraint)
lincences table: UNIQUE (name) (column constraint)
source_type table: UNIQUE (name) (column constraint)
dimensions table: UNIQUE (name) (column constraint)
categories table: UNIQUE (name) (column constraint)
types table: UNIQUE (name), UNIQUE (symbol) (column constraints)
layers table: UNIQUE (name) (column constraint)
aspects table: UNIQUE (aspect), UNIQUE (index_letter) (column constraints)
provenance table: UNIQUE (name) (column constraint)
project table: UNIQUE (project_name) (column constraint)
datagroups table: UNIQUE (datagroup_name) (column constraint)
classification_definitions table: UNIQUE (classification_name), it is not possible to create the same custom classification twice, e.g. origin_process__1_F_steel_SankeyFlows_2008_Global
Table constraints (across column(s): UNIQUE):
classification_items table: UNIQUE (classification_id, attribute_1_oto). That would mean that the combination of classification_id and attribute can only exist once, i.e. no accidental addition of the same classification possible.
datasets table: UNIQUE (dataset_name, dataset_version)
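To illustrate the difference between the two kinds of constraints, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The database engine, the column types, the `id` columns, the helper `try_insert`, and the sample rows are all assumptions made for the example; only the table names, column names, and UNIQUE constraints come from the list above.

```python
import sqlite3

# In-memory database as a stand-in; the real engine and column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE units (
    id INTEGER PRIMARY KEY,
    unitcode TEXT,
    unit_name TEXT,
    UNIQUE (unitcode),   -- column constraint
    UNIQUE (unit_name)   -- column constraint
);
CREATE TABLE datasets (
    id INTEGER PRIMARY KEY,
    dataset_name TEXT,
    dataset_version TEXT,
    UNIQUE (dataset_name, dataset_version)  -- table constraint across two columns
);
""")


def try_insert(sql, params):
    """Hypothetical helper: True if the row was inserted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


unit_sql = "INSERT INTO units (unitcode, unit_name) VALUES (?, ?)"
dataset_sql = "INSERT INTO datasets (dataset_name, dataset_version) VALUES (?, ?)"

results = [
    try_insert(unit_sql, ("kg", "kilogram")),        # accepted
    try_insert(unit_sql, ("kg", "kilogramme")),      # rejected: unitcode already exists
    try_insert(dataset_sql, ("steel_flows", "1")),   # accepted
    try_insert(dataset_sql, ("steel_flows", "2")),   # accepted: same name, new version
    try_insert(dataset_sql, ("steel_flows", "1")),   # rejected: same name and version
]
print(results)  # [True, False, True, True, False]
```

With the table constraint, a dataset name may repeat as long as the version differs, whereas each column constraint blocks any repeated value in that single column.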
:warning: Instead of adding new comments all the time I will just keep editing this one.
I suggest making the following changes to the classification_definitions
and classification_items
tables (cf. master files):
In the definitions table, add the 'general' column (TRUE if the classification is in general use, e.g., chemical elements) and the 'created_from_dataset' column (TRUE if the classification is defined by the upload script and the classification items are populated from the dataset). Full description: see master xlsx file.
In the classification_items table, rename the columns 'attribute1' through 'attribute4' to 'attribute1_oto' through 'attribute4_oto', where 'oto' stands for 'one-to-one', indicating that these four columns are reserved for attributes that form bijective descriptions of the classification items, e.g., chemical element names, atomic numbers, and symbols. Rename the columns 'attribute5' through 'attribute15' to 'attribute5_anc' through 'attribute15_anc' to indicate that these attributes do not need to be 1:1 descriptions of the items but can indicate other relations, such as the aggregation to broader regions or substance groups.
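A rough sketch of what 'one-to-one' means in practice: every value in an 'oto' column identifies exactly one item, so no value may repeat within the column. The function `is_one_to_one`, the dictionary keys, and the sample items are hypothetical and only serve the illustration; they are not part of the master files.

```python
def is_one_to_one(items, columns):
    """Return True if each listed column holds a distinct value for every item."""
    for col in columns:
        values = [item[col] for item in items]
        if len(set(values)) != len(values):  # a repeated value breaks the 1:1 relation
            return False
    return True


# Chemical elements: symbol, name and atomic number each identify one element.
elements = [
    {"symbol": "H", "name": "Hydrogen", "atomic_number": 1},
    {"symbol": "He", "name": "Helium", "atomic_number": 2},
]

# Countries aggregated to a broader region: several items share one value.
regions = [
    {"country": "China", "continent": "Asia"},
    {"country": "Mongolia", "continent": "Asia"},
]

elements_ok = is_one_to_one(elements, ["symbol", "name", "atomic_number"])
regions_ok = is_one_to_one(regions, ["country", "continent"])
print(elements_ok, regions_ok)  # True False
```

Attributes like the element columns would fit the '_oto' columns, while an attribute like the continent would belong in an '_anc' column.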
Looks good. No objections from my side. Should we close the issue for now?
1) in classification_definitions: classification_definition.reserve5: Switch to "CustomFlag": Set True (1) if classification was created from dataset on the fly.
2) in classification_definitions: classification_definition.reserve4: Switch to "BijectiveFlag": Set True (1) if the different attributes provided (if any) have a 1:1 relationship. TRUE for elements, for example (H, Hydrogen, 1 are 1:1:1), FALSE for iso_regions, as its attributes contain continents as well and both China and Mongolia link to Asia.
3) Switch the following columns to UNIQUE: