openspending / cameroon.openspending.org

Website for "Cameroon Budget Inquirer"
http://cameroon.openspending.org/

Investment data should be updateable just by changing the dataset #44

Open vitorbaptista opened 10 years ago

vitorbaptista commented 10 years ago

Basically, I want to add another year by just going on OpenSpending and updating the data.

We already get the regions and departments from the dataset, so the only thing missing, AFAIK, is adding a new year. To make that work we need not only the investment data but, because we show per capita investments, also population and area. These currently come from https://github.com/openspending/cameroon.openspending.org/blob/gh-pages/data/population.json

The population and area data are different. As they're not financial data, it doesn't make much sense to create a cm-population dataset on OpenSpending and upload them there (because we would need an amount column). We could make it work by using population as the amount, for example, but that's hacky.

Another option is to upload that to the DataHub, and use the DataStore to query it.
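To illustrate why the population figures are needed alongside the investment data, here is a minimal sketch of the per capita calculation. The input shapes (plain mappings from region name to a number) and the function name are assumptions for illustration only; they are not the actual structure of population.json or of the OpenSpending dataset.

```python
def per_capita_investment(investments, population):
    """Divide each region's investment by its population.

    Both arguments are hypothetical mappings of region name -> number.
    Regions with no (or zero) population entry are skipped, since a
    per capita value cannot be computed for them.
    """
    result = {}
    for region, amount in investments.items():
        people = population.get(region)
        if not people:
            continue  # no population known for this region
        result[region] = amount / people
    return result
```

The same division works for investment per unit of area by passing a region-to-area mapping instead of the population one.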

vitorbaptista commented 10 years ago

The data shown on the regional investment page is a subset of the national-level data; its only advantage is a zoomed-in map. As it isn't showing new data, we can focus only on the national level as a first step. Afterwards, if we think it's useful, we can change the regional level to use the same dataset as the national one.

About the population issue, @anderspeders and I agreed that it changes so infrequently that it's not worth it to move it out of the JSON file (into the DataHub or something).

vitorbaptista commented 10 years ago

I still need to update the text on the "About" tab of the national investment page, where it links to the source data in "raw form" and "cleaned and processed". The raw data might not be a single file anymore, so we could simply link to a dataset on the DataHub or somewhere else. For the cleaned and processed data, we can link to http://openspending.org/cm-pib.csv

vitorbaptista commented 10 years ago

Another advantage of having separate national and regional investment pages is that the regions can update their investment datasets separately. For example, there is investment data for 2010 for the North-West region, but the national dataset only has data for 2011. Even though the datasets' models are the same, I can't simply take that 2010 data and add it to the national dataset, because if I did, the user would see the option to check the national data for 2010, which would be wrong, as there's only data for the North-West region.

vitorbaptista commented 10 years ago

To solve the problem mentioned above, I could display on the national investments page only the years that have data for x regions. Right now we just have data for 2011, and it has 11 regions: the 10 geographical regions plus "General or External". So this could be the default: if a year has 11 regions, I consider it complete and display it.

Another option would be to have a separate dataset, cm-regional-investments, with just the regional investments (as the name suggests). The problem with this approach is that when they add a new year, they'll have to upload the regions both to cm-regional-investments and, if they have the data for all of Cameroon, to cm-investments.
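The first option (showing only years with a complete set of regions) could be sketched roughly as follows. The entry shape, with "year" and "region" keys, and the function name are assumptions made for illustration; the real field names depend on the cm-investments data model.

```python
from collections import defaultdict

# 10 geographical regions plus "General or External", as in the 2011 data.
EXPECTED_REGIONS = 11


def complete_years(entries, expected_regions=EXPECTED_REGIONS):
    """Return the sorted years that have exactly `expected_regions` regions.

    `entries` is a hypothetical list of dicts with "year" and "region" keys.
    """
    regions_by_year = defaultdict(set)
    for entry in entries:
        regions_by_year[entry["year"]].add(entry["region"])
    return sorted(
        year
        for year, regions in regions_by_year.items()
        if len(regions) == expected_regions
    )
```

With this check, a year such as 2010 that only has North-West entries would not be offered on the national page, while it could still be used on the regional one.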

vitorbaptista commented 10 years ago

This is blocked until we define the cm-investments data model in #47.