devinit / datahub

Datahub v2
http://data.devinit.org

Unbundling aid update #462

Open Duncan-Knox opened 5 years ago

Duncan-Knox commented 5 years ago

Data series needed for unbundling aid update:

This involves tracking update progress on the tables that drive the Unbundling Aid section of the website.

List of Tables (included in last year's issue)

Dimension series needed for production of fact.oda_constant_new:

Other

Please edit the issue if needed @Napho @akmiller01

Duncan-Knox commented 5 years ago

Added to backlog for carry over into 2019.

akmiller01 commented 5 years ago

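Null characters in the raw CRS file cause read errors in R. Once they are replaced with spaces, the file can be read without error, like so:

library(data.table)  # provides fread
# txt is the path to one raw CRS text file; the crs_cleanup/ folder must already exist
r = readBin(txt, raw(), file.info(txt)$size)
r[r==as.raw(0)] = as.raw(0x20) ## replace null bytes with 0x20 = <space>
writeBin(r, paste0("crs_cleanup/",basename(txt)) )
tmp = fread(paste0("crs_cleanup/",basename(txt)),sep="|")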
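If several raw CRS files need the same treatment, the cleanup step could be wrapped in a small helper. This is only a sketch using base R: the function name `strip_nulls`, the `out_dir` argument and the demo file are made up for illustration and are not part of the datahub code.

```r
# Hypothetical helper: replace NUL bytes in one file and write a cleaned copy.
strip_nulls <- function(path, out_dir = "crs_cleanup") {
  # Create the output folder if it is missing
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  # Read the whole file as raw bytes
  bytes <- readBin(path, what = raw(), n = file.info(path)$size)
  # Replace every NUL byte (0x00) with a space (0x20)
  bytes[bytes == as.raw(0)] <- as.raw(0x20)
  out_path <- file.path(out_dir, basename(path))
  writeBin(bytes, out_path)
  out_path
}

# Small self-check on a temporary file containing one embedded NUL byte.
demo_in <- file.path(tempdir(), "demo_crs.txt")
writeBin(c(charToRaw("a|b"), as.raw(0), charToRaw("c\n")), demo_in)
demo_out <- strip_nulls(demo_in, out_dir = file.path(tempdir(), "crs_cleanup"))
cleaned <- readBin(demo_out, what = raw(), n = file.info(demo_out)$size)
any(cleaned == as.raw(0))  # FALSE
```

The helper returns the cleaned path, so it can be passed straight to `fread(..., sep="|")` or applied over a folder with `lapply(list.files(...), strip_nulls)`.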

After that, LATIN1 encoding must be specified to import it into SQL:

DELIMITER ',' ENCODING 'LATIN1' CSV HEADER;
COPY 3764422