PopulateTools / gobierto

Open source open government platform
https://gobierto.es
GNU Affero General Public License v3.0

Data / Appending data to an existing dataset with custom schema fails #2957

Closed ferblape closed 4 years ago

ferblape commented 4 years ago

If you create a dataset with a custom schema, any subsequent request to update the dataset using the append option overwrites that schema. I'd expect the schema not to be overwritten by an action that only updates the data.

Steps to reproduce:

  1. Define the $API_TOKEN_ADMIN environment variable.

  2. Create the dataset using a custom schema:

ruby $DEV_DIR/gobierto-etl-utils/operations/gobierto_data/upload-dataset/run.rb --api-token $API_TOKEN_ADMIN --gobierto-url http://madrid.gobierto.test --name "Calidad aire" --slug calidad-aire --table-name calidad_aire --csv-separator=';' --file-path=data.csv --schema-path=$DEV_DIR/gobierto-etl-datos/datasets/calidad_del_aire_madrid/schema_create.json

  3. Update the dataset with today's data (using the append option):

ruby $DEV_DIR/gobierto-etl-utils/operations/gobierto_data/upload-dataset/run.rb --api-token $API_TOKEN_ADMIN --gobierto-url http://madrid.gobierto.test --name "Calidad aire" --slug calidad-aire --table-name calidad_aire --file-path=daily_data.csv --append

In the code that generates the create statements, the table is created from scratch instead of reusing the schema of the existing table.
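To make the expected behaviour concrete, here is a minimal sketch of the decision I'd expect. It is not the actual gobierto code: the method names (schema_for_load, load_statements), the column names and types, and the shape of the generated SQL are all made up for illustration, and it assumes a PostgreSQL-style COPY for the load.

```ruby
# Hypothetical sketch: none of these names come from the gobierto codebase.

# Pick the schema to use when loading a CSV into a dataset table.
def schema_for_load(existing_schema:, provided_schema:, inferred_schema:, append:)
  # When appending to a table that already exists, keep its schema untouched.
  return existing_schema if append && existing_schema

  # Otherwise prefer an explicitly provided schema, then fall back to inference.
  provided_schema || inferred_schema
end

# Build the SQL statements for the load. The table is only (re)created
# when we are NOT appending to an existing table.
def load_statements(table_name:, schema:, append:, table_exists:)
  statements = []
  unless append && table_exists
    columns = schema.map { |name, type| "#{name} #{type}" }.join(", ")
    statements << "DROP TABLE IF EXISTS #{table_name}"
    statements << "CREATE TABLE #{table_name} (#{columns})"
  end
  statements << "COPY #{table_name} (#{schema.keys.join(', ')}) FROM STDIN WITH (FORMAT csv)"
  statements
end

# Usage: an append against an existing table with a custom schema.
# The column names and types below are invented for the example.
custom_schema   = { "fecha" => "date", "valor" => "numeric" }
inferred_schema = { "fecha" => "text", "valor" => "text" }

schema = schema_for_load(
  existing_schema: custom_schema,
  provided_schema: nil,
  inferred_schema: inferred_schema,
  append: true
)
puts load_statements(table_name: "calidad_aire", schema: schema, append: true, table_exists: true)
```

With append set and the table already present, only the data load statement is emitted, so the custom column types from the first upload survive.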

Associated Rollbar: https://rollbar.com/Populate/gobierto/items/3649/

ferblape commented 4 years ago

Issue description updated. You can grab the data from the Jenkins server or ask me for it.