devinit / digital-platform

PostgreSQL/analyst → MongoDB → Development Data Hub
http://data.devinit.org:8888/#!/ & http://data.devinit.org/#!/
3 stars 12 forks source link

New DDW DB uploaded to server + some background info #244

Closed dw8547 closed 7 years ago

dw8547 commented 8 years ago

ddw_20160127084339 is now on 213.168.251.124 & modified the relevant user permissions to match those for ddw.

Please let me know when I can drop ddw and rename ddw_20160127084339.

The DH uses about 285 different data series/files (including those belonging to the category reference) each of which lives here: https://github.com/devinit/digital-platform/tree/master/country-year

The goal is to automate the generation of all of those data series. At the moment & in the past, the analysts put these data series/files together by hand. They are sourced from a number of external data providers, the major ones being OECD, WB & IMF.

We have currently automated 77 out of 285. We read in the remaining data series into tables from the flat files on: https://github.com/devinit/digital-platform/tree/master/country-year. The analysts are currently working on updating the data series/files that have not been automated. They are not ready yet, but when they I, I will repopulate the relevant tables. They haven't provided a date for when I can do this.

So, ideally, we would like the 'ddw' DB to only have two major schemas, 'dimension' & 'fact', from which the DH would read files. Because we haven't automated all of the series, the set up is a bit different. It is:

Name Owner
dac_country_deflator x
data x
data_series x
deflator x
dimension x
fact x
non_dac_country_deflator x
public postgres
reference x
series x

The DH will need to read the data from:

Name Owner
data_series x
dimension x
fact x
reference x

All the automated data series are in 'fact'. All the data series that have not been automated are in 'data_series'. All the automated reference files are in 'dimension'. All the reference files that have not been automated are in 'reference'.

We have kept the names of all the data series/files with the exception of changing the '-' to '_' in the names. Hence DH's 'oda.csv' is now ddw table oda.

A couple of things to look out for:

dw8547 commented 8 years ago

@xriss & @notshi, I've just noticed that the price conversion function leaves out some of the columns for 'oda'. I will fix it & push to 213.168.251.124 again when ready. My apologies! Once I've done that I'll get on with checking the files.

dw8547 commented 8 years ago

@xriss & @notshi ddw_20160127143241 is now on 213.168.251.124 with the corrected price conversion function & oda_2012 table.

notshi commented 8 years ago

Thanks, @dw8547. Duly noted.

xriss commented 8 years ago

You can rename and drop whenever you want, the only bit that will cause any problems is that I need a readonly user.

currently I've setup a user called, "readonly" with appropriate access to ddw.

However if you add a new database or schema or table then you will need to make sure that the following have been granted for each new one.

GRANT CONNECT ON DATABASE dbname TO readonly;

GRANT USAGE ON SCHEMA schema TO readonly; GRANT SELECT ON ALL SEQUENCES IN SCHEMA schema TO readonly; GRANT SELECT ON ALL TABLES IN SCHEMA schema TO readonly;

dw8547 commented 8 years ago

@xriss readonly now has the above permissions on ddw_20160127143241.

@notshi I can't drop ddw (and rename ddw_20160127143241)

ERROR: database "ddw" is being accessed by other users DETAIL: There is 1 other session using the database.

postgres=# SELECT usesysid, usename, datname FROM pg_stat_activity WHERE usename != 'x';

usesysid usename datname
16384 shi ddw
notshi commented 8 years ago

Thanks, @dw8547. I've logged out now so should be ok to rename.

dw8547 commented 8 years ago

@notshi, dropped and renamed.

dw8547 commented 7 years ago

Dead issue.