usc-isi-i2 / datamart-api

MIT License

name vs description in variable metadata #17

Closed saggu closed 4 years ago

saggu commented 4 years ago

metadata/datasets/FSI/variables returns

[
  {
    "variable_id": "c1_security_apparatus",
    "dataset_id": "FSI"
  },
  {
    "variable_id": "c2_factionalized_elites",
    "dataset_id": "FSI"
  },
  {
    "variable_id": "c3_group_grievance",
    "dataset_id": "FSI"
  },
  {
    "variable_id": "e1_economy",
    "dataset_id": "FSI"
  }
]

Neither name nor description is returned.

But /metadata/datasets/FSI/variables/e1_economy returns

{
  "variable_id": "e1_economy",
  "dataset_id": "FSI",
  "description": "E1: Economy in FSI",
  "corresponds_to_property": "PFSI-005",
  "qualifier": [
    {
      "identifier": "P585",
      "name": "point in time"
    },
    {
      "identifier": "P248",
      "name": "stated in"
    }
  ]
}

Note that it has the description.

This is an issue Brandon raised.

Another thing:

metadata/datasets/WDI/variables returns

[
  {
    "name": "_2005 PPP conversion factor, GDP (LCU per international $)",
    "variable_id": "_2005_ppp_conversion_factor_gdp_lcu_per_international",
    "dataset_id": "WDI"
  },
  {
    "name": "_2005 PPP conversion factor, private consumption (LCU per international $)",
    "variable_id": "_2005_ppp_conversion_factor_private_consumption_lcu_per_international",
    "dataset_id": "WDI"
  },
  {
    "name": "Access to clean fuels and technologies for cooking (% of population)",
    "variable_id": "access_to_clean_fuels_and_technologies_for_cooking_of_population",
    "dataset_id": "WDI"
  },
  {
    "name": "Access to electricity (% of population)",
    "variable_id": "access_to_electricity_of_population",
    "dataset_id": "WDI"
  }
]

It has name.

So what is the difference between name and description? Was the data for FSI incorrectly processed?

kyao commented 4 years ago

From the code, metadata/datasets/{dataset_id}/variables is supposed to return name, dataset_id and variable_id. The FSI dataset is missing name for some reason. I'll look into it.
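A quick way to spot which entries in a variables listing lack the name field is sketched below. This is only an illustration: the helper `variables_missing_field` is hypothetical (not part of datamart-api), and the sample entries are copied from the responses quoted earlier in this issue rather than fetched from a live server.

```python
from typing import Dict, List


def variables_missing_field(
    variables: List[Dict[str, str]], field: str = "name"
) -> List[str]:
    """Return the variable_id of every entry that lacks a non-empty `field`."""
    return [
        v.get("variable_id", "<unknown>")
        for v in variables
        if not v.get(field)
    ]


# Sample entries copied from the responses quoted in this issue.
fsi_listing = [
    {"variable_id": "c1_security_apparatus", "dataset_id": "FSI"},
    {"variable_id": "e1_economy", "dataset_id": "FSI"},
]
wdi_listing = [
    {
        "name": "Access to electricity (% of population)",
        "variable_id": "access_to_electricity_of_population",
        "dataset_id": "WDI",
    },
]

print(variables_missing_field(fsi_listing))  # FSI entries carry no name
print(variables_missing_field(wdi_listing))  # WDI entry carries a name
```

Running the same check over the parsed JSON of each dataset's listing would show which datasets are affected.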

kyao commented 4 years ago

In addition to FSI, the OECD and WGI datasets are also missing the name field. The other Causeex datasets have names.

kyao commented 4 years ago

All the edge files for those three datasets have P1476, the name property, for the variables.

$ grep QFSI-002 ~/Downloads/fsi-datamart-kgtk-exploded-uniq-ids.tsv 
QFSI-002-label  QFSI-002        label   "C1: Security Apparatus"        string  True    0                                               "C1: Security Apparatus"
QFSI-002-P1476  QFSI-002        P1476   "C1: Security Apparatus"        string  True    0                                               "C1: Security Apparatus"
QFSI-002-description    QFSI-002        description     "C1: Security Apparatus in FSI" string  True    0                                               "C1: Security Apparatus in FSI"
QFSI-002-P31-1  QFSI-002        P31     Q50701  symbol  True    0                                                                                                               Q50701
QFSI-002-P1687-1        QFSI-002        P1687   PFSI-002        symbol  True    0                                                                                                               PFSI-002
QFSI-002-P2006020002-P585       QFSI-002        P2006020002     P585    symbol  True    0                                                                                                               P585
QFSI-002-P2006020002-P248       QFSI-002        P2006020002     P248    symbol  True    0                                                                                                               P248
QFSI-002-P2006020004-1  QFSI-002        P2006020004     QFSI    symbol  True    0                                                                                                               QFSI
QFSI-002-P1813  QFSI-002        P1813   c1_security_apparatus   symbol  True    0                                                                                                               c1_security_apparatus
QFSI-P2006020003-QFSI-002       QFSI    P2006020003     QFSI-002        symbol  True    0                                                                                                               QFSI-002

Those three datasets were the first three loaded. Maybe we had a bug back then in our system when we uploaded them.

I'll try reloading them tomorrow.
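The grep check above covers a single node. A rough sketch of the same check over a whole edge file follows. It assumes the file is tab-separated with columns in the order seen in the grep output (edge id, node, property, value) and that every variable node carries a P1813 edge holding its variable_id. The function name `nodes_missing_property` and the node `QX` in the sample are hypothetical.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def nodes_missing_property(
    lines: Iterable[str], required: str = "P1476", marker: str = "P1813"
) -> List[str]:
    """List nodes that have a `marker` edge but no `required` edge."""
    props: Dict[str, Set[str]] = defaultdict(set)
    # QUOTE_NONE keeps the double quotes in string values as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        node, prop = row[1], row[2]
        props[node].add(prop)
    return sorted(
        node
        for node, labels in props.items()
        if marker in labels and required not in labels
    )


# Two edges taken from the grep output, plus one made-up node (QX)
# that has a variable_id edge but no name edge.
sample = [
    'QFSI-002-P1476\tQFSI-002\tP1476\t"C1: Security Apparatus"',
    "QFSI-002-P1813\tQFSI-002\tP1813\tc1_security_apparatus",
    "QX-P1813\tQX\tP1813\thypothetical_variable",
]

print(nodes_missing_property(sample))
```

For a real file, pass the open file object as `lines`. An empty result would confirm that the edge file itself is fine and that the problem lies in what was loaded into the database.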

zmbq commented 4 years ago

I'm unassigning myself from this bug, as it seems like a problem with the particular edges in the dataset, and not something with the queries.

kyao commented 4 years ago

Re-importing the tsv files for those three datasets solved the problem. They now return the name field. I just uploaded a new datamart.sql.gz file for creating a new datamart-postgres-volume. Alternatively, importing those three tsv files with the import_tsv_postgres command works, too.

saggu commented 4 years ago

I will test it and deploy it for WM and other teams.

saggu commented 4 years ago

Fixed in the latest database backup