alukach opened this issue 3 months ago
If it is not difficult to throw an error I think that would be better. In these cases we could disable the study until the ingestion is complete and successful.
I would vote for always assuming the data is complete and consistent, and fail if not. The user will then have to edit their data until the ingestion works. This is almost like a validation at ingestion type of thing. Otherwise there is a high risk to ingest unexpected data that might be inconsistent without the user noticing.
I will ask Ricardo to review these cases in the municipal data.
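The validate-at-ingestion idea discussed above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the row shape, field names (`scenario`, `category`, `usage`, `source`, `field`), and the `resolve_field` helper are all hypothetical stand-ins for the real `metrics_metadata` schema, and a blank string is assumed to represent "all"/baseline as described in the issue body.

```python
from dataclasses import dataclass

# Hypothetical row shape mirroring the metrics_metadata columns
# described in this issue; the real schema may differ.
@dataclass(frozen=True)
class MetadataRow:
    scenario: str   # "" = baseline
    category: str
    usage: str      # "" = all usages
    source: str     # "" = all sources
    field: str      # which metrics-table field renders this datapoint

def resolve_field(metadata: list[MetadataRow],
                  category: str,
                  usage: str = "",
                  source: str = "",
                  scenario: str = "") -> str:
    """Find the single metrics field for a (scenario, category, usage,
    source) combination, failing loudly when the metadata is missing
    or ambiguous -- the "fail if not consistent" approach above."""
    matches = [r for r in metadata
               if (r.scenario, r.category, r.usage, r.source)
               == (scenario, category, usage, source)]
    if not matches:
        raise LookupError(
            f"No metadata row for {(scenario, category, usage, source)}")
    if len(matches) > 1:
        raise LookupError(
            f"Ambiguous metadata for {(scenario, category, usage, source)}")
    return matches[0].field

# Illustrative data only; field names are made up.
metadata = [
    MetadataRow(scenario="", category="Cost", usage="", source="",
                field="total_cost"),
    MetadataRow(scenario="retrofit", category="Cost", usage="", source="",
                field="total_cost_delta"),
]

print(resolve_field(metadata, "Cost"))                       # baseline lookup
print(resolve_field(metadata, "Cost", scenario="retrofit"))  # scenario lookup
```

Running this lookup at ingestion time for every expected combination would surface missing or duplicated metadata rows as errors, rather than letting inconsistent data through silently.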
Our system works by normalizing metrics into unique combinations of `category`, `usage`, and `source`. We look to the `metrics_metadata` table to determine which fields in the `metrics` table can be used to represent a given `category`, `usage`, and `source` datapoint for a given scenario.

For example, if we are interested in the total cost for a study, that would be represented by `Category=Cost`, `Usage=`, `Source=` (where a blank value implies all). To determine which field we would render for this data on the baseline scenario, we can look to our `metrics_metadata` table and find the row that contains a blank `scenario` column, a `category` column with a value of `Cost`, and blank `usage` and `source` columns. To find out how any given scenario would affect that value, we would look for a row with a `scenario` column matching the scenario of interest, a `category` column with a value of `Cost`, and blank `usage` and `source` columns.

However, we are noticing some data inconsistencies when reviewing Municipal Data v4:
- `category=Cost`, `usage=`, `source=`
- `category=Cost`, `usage=`, `source=`
This brings up the following question: