ccmdesign-archives / open-government-research-exchange


Data not being pulled from Spreadsheet in the right format #38

Closed claudioccm closed 8 years ago

claudioccm commented 8 years ago

check this page for weird names - all-categories.html

Things like: AUSTRALIA-EUROPE, AZERBAIJAN-LIBERIA, MEXICO-CANADA-AUSTRALIA-SPAIN-TURKEY, etc.

This is probably more an issue of data inconsistency. Let's see what the best approach is. We could ask Ani to help with these manual consistency issues in the spreadsheet.

ghost commented 8 years ago
[screenshots, 2016-04-22 11:18 am and 11:17 am]

the hyphenated values are present in both the json subsets and papers.json, so they are coming out of the csv conversion like this.

in the spreadsheet, the field is a comma-separated list,

[screenshot, 2016-04-22 11:22 am]

but we slugify this field as part of the column mapping, as specified in the gulpfile here:

'Region' : { value : 'region', slugify : true },

so maybe this is not so much a data quality issue as a case where they want multiple values in this field. i can add splitting on commas to the mapping options, if you'd like. we just have to be aware that the field will then be an array, so any templates using it have to be updated accordingly.
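a rough sketch of the difference, in case it helps. this is not the gulpfile code: `slugify`, `mapField` and the `split` option are hypothetical stand-ins, and the cell values are guesses at what the spreadsheet holds. it only illustrates why slugifying the whole cell glues the regions together, and what splitting first would give us instead.

```javascript
// Hypothetical slugify, standing in for whatever the gulpfile actually uses.
// Assumption: it lowercases and collapses non-alphanumeric runs into "-".
function slugify(text) {
  return String(text)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical column mapper. `options.split` is an assumed new option:
// when set, the cell is split on that separator before slugifying,
// and the result is an array instead of a single string.
function mapField(raw, options) {
  const parts = options.split
    ? String(raw).split(options.split)
    : [String(raw)];

  const values = parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => (options.slugify ? slugify(part) : part));

  if (options.split) {
    return values;
  }
  return values.length ? values[0] : "";
}

// Current behaviour: the whole cell becomes one slug.
console.log(mapField("Australia, Europe", { slugify: true }));
// prints "australia-europe"

// Proposed behaviour: one slug per region.
console.log(mapField("Australia, Europe", { slugify: true, split: "," }));
// prints [ 'australia', 'europe' ]
```

with something like this, templates that currently print the region as a string would need to loop over the array instead.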

want me to go ahead and parse this field to an array?

ghost commented 8 years ago

i think there might have been a few fields like this that we actually wanted to be parsed into arrays. could you give me a quick list of them, if you know them off the top of your head, and i will change them all at once?

EDIT: this is the same issue as https://github.com/ClaudioMendoncaDesign/open-government-research-exchange/issues/21, so i will do a fix for #21, which should also address this.