africanmathsinitiative / R-Instat-Help

The latest uncompiled (.hnd) and compiled (.cmd) versions of the help file for the statistics software package R-Instat
https://chuffed.org/project/africandatainitiative
GNU General Public License v3.0

The World Bank procurement data - 1 #10

Open rdstern opened 6 years ago

rdstern commented 6 years ago
  1. From loading R-Instat choose File > Open from Library.

  2. There press the button to Load from Instat Collection and then the Browse button.

  3. You see the Import Dataset dialogue with Import from Library in front.

  4. Choose the Procurement directory and then the file called WorldBank.RDS. It is about 30 MB.

  5. Click OK.

  6. The import takes some time, though it does at least display an apologetic message.

  7. Once imported you see it has 3 sheets or data frames. The one that is open has information on 185 thousand World Bank contracts. R-Instat is just giving a window onto part of this data frame. It doesn't show the whole data.

  8. We now examine a few columns (or variables) in turn. The first is the World Bank project ID. Each project has a number of contracts. We first look in a bit more detail at the 3rd variable, called ca_year. At the top, right-click on this column name and choose the item Levels/Labels.

  9. This opens a dialogue where we see there were only a few contracts in the early years. We might want to start our analyses from 2003.

  10. While this dialogue is open we look also at the countries. Scrolling down we see that Kenya had 1088 contracts.

  11. Close this dialogue. To learn more about the variables, click on the toolbar button with an i (for information). This opens a third window. Make it larger and also make the second field, called label, wider.

  12. Each row here shows information about a column of the data. This includes a longer and more descriptive name for each variable. For example, the column ca_year is the year of the signature date for a contract.

  13. Scrolling to the bottom we see that some of the last columns refer to tax havens.

  14. We have started to understand the columns of data. It isn't necessary to understand them all, before starting the analysis, because the main ones have been defined, as we see later.

  15. Before closing the metadata window we look at one of the other data sets, the third, which is at the country-level.

  16. This has just 170 rows, one for each country.

  17. We also look at the meta-data for this data frame.

  18. We are aiming to draw a map. The 4th column gives the country names, ready for a map. As an example we will also look at the index of corruption by Transparency International in 2013. That is called cpi2013.

  19. Now we first close the metadata window, either by clicking again on the icon, or using the arrow to reset to the default positions.

  20. You may already have a menu called Procurement in your version of R-Instat. If not, use the View menu and tick on the option shown.

  21. On this new menu use the Map Country Values dialogue, as we show here, with country_name_map as the country column and cpi2013 as the value. You can just click OK here to give a world map, but we choose an option to show Africa. Then here is the resulting map.

  22. This presentation has been to introduce an example of the procurement data. The next ones will continue with the analyses.
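
For readers who think in code, the Levels/Labels view in steps 8 to 10 amounts to a frequency count for each level of a column. Below is a minimal Python sketch of that idea on a handful of invented rows. The rows and the `country` column name are placeholders for illustration (only `ca_year` is named in the script above), and this is not how R-Instat itself builds the table.

```python
from collections import Counter

# Invented placeholder rows; the real data frame has about 185 thousand contracts.
contracts = [
    {"ca_year": 1999, "country": "Kenya"},
    {"ca_year": 2003, "country": "Kenya"},
    {"ca_year": 2003, "country": "Ghana"},
    {"ca_year": 2004, "country": "Kenya"},
    {"ca_year": 2004, "country": "Ghana"},
    {"ca_year": 2004, "country": "Malawi"},
]


def level_counts(rows, column):
    """Count how many rows fall in each level of `column`, sorted by level."""
    counts = Counter(row[column] for row in rows)
    return dict(sorted(counts.items()))


def from_year(rows, first_year):
    """Keep only the rows whose ca_year is first_year or later."""
    return [row for row in rows if row["ca_year"] >= first_year]


year_counts = level_counts(contracts, "ca_year")
recent = from_year(contracts, 2003)  # start the analysis from 2003, as in step 9
country_counts = level_counts(recent, "country")

print(year_counts)
print(country_counts)
```

The same two helpers cover both the year check in step 9 and the per-country count in step 10; only the column name changes.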


trottingafrican commented 6 years ago

"We now examine a few columns (or variables) in turn. The first is the World Bank project ID. The projects each have a number of contracts and we first look in a bit more detail at the 3rd variable, called ca_year. At the top, right-click when in this name field and choose the item Levels/Labels."

The Levels/Labels item is disabled, so we are unable to proceed with the steps that follow.

mmumbo commented 6 years ago
"Once imported you see it has 3 sheets or data frames. The one that is open has information on 185 thousand World Bank contracts. R-Instat is just giving a window onto part of this data frame. It doesn't show the whole data."

The file has only a single data sheet, called WorldBank. Is this the same on your side?

rdstern commented 6 years ago

It is enabled in my version. Perhaps we did that after version 0.4.6? For now, you can use the Factor > Levels/Labels dialogue with that column to get to the same place.

One extra thing to mention is that Danny will change the data you are using. For now you can make the change yourself: sort the data on the ID column. That is how it will be in the video version. He will change the data set today.
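
As a rough illustration of what sorting on the ID column does, here is a small Python sketch. The column name `project_id` and the ID values are hypothetical; the comment above only says that the data are sorted on the ID column.

```python
# Hypothetical rows: `project_id` stands in for the World Bank project ID column.
contracts = [
    {"project_id": "P003", "ca_year": 2005},
    {"project_id": "P001", "ca_year": 2004},
    {"project_id": "P003", "ca_year": 2003},
    {"project_id": "P002", "ca_year": 2006},
]

# sorted() is stable, so contracts within the same project keep
# their original relative order after the sort.
sorted_contracts = sorted(contracts, key=lambda row: row["project_id"])

for row in sorted_contracts:
    print(row["project_id"], row["ca_year"])
```

Because the sort is stable, the two P003 contracts stay in the order they had before, which keeps each project's contracts together without reshuffling them.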

rdstern commented 6 years ago

No. Perhaps you opened the wrong file? There is also WorldBank_old.RDS?

It should have the data sheet with 185,000+ rows, then a second sheet with Year by Country and a third with just Country.
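
A quick way to tell the two files apart is to compare the sheet layout against this description. The Python sketch below assumes you already have the row count of each sheet, in file order. The function name is hypothetical, and the size ordering of the two summary sheets is an assumption rather than something stated in this thread.

```python
def looks_like_expected_file(sheet_row_counts):
    """Return True if the sheet layout matches the description above.

    `sheet_row_counts` is a list of row counts, one per sheet, in file order:
    contracts, then Year by Country, then Country.
    """
    if len(sheet_row_counts) != 3:
        return False
    contracts, year_by_country, country = sheet_row_counts
    # Assumption: each summary sheet is no larger than the sheet before it,
    # since every country appears at least once in the year-by-country sheet.
    return contracts >= 185_000 and contracts >= year_by_country >= country > 0


# 170 is the country-sheet size given in the script; the other counts are invented.
print(looks_like_expected_file([186_000, 2_500, 170]))
# A file with only one sheet, as reported above, fails the check.
print(looks_like_expected_file([186_000]))
```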

dannyparsons commented 6 years ago

You will need to be using the latest version through Visual Studio to use everything you need to make the videos.

mmumbo commented 6 years ago

Hello and welcome. In this R-Instat video presentation we are going to introduce you to an example of the procurement data.

  1. After loading R-Instat choose File > Open from Library.
  2. Press the button to Load from Instat Collection, then click on the Browse button.
  3. You see the Import Dataset dialogue with Import from Library in front.
  4. Choose the Procurement directory and then the file called WorldBank.RDS.
  5. Click Open, then OK.
  6. It takes some time to import and it might include an apologetic message.
  7. Once imported you see it has 3 sheets or data frames. The one that is open has information on 185 thousand World Bank contracts. R-Instat is just giving a window onto part of this data frame. It doesn't show the whole data.
  8. We now examine a few columns (or variables) in turn. The first is the World Bank project ID. Each project has a number of contracts. We first look in a bit more detail at the 3rd variable, called ca_year. At the top, right-click on this column name and choose the item Levels/Labels.
  9. This opens a dialogue where we see there were only a few contracts in the early years. We might want to start our analyses from 2003.
  10. While this dialogue is open we look also at the countries. Scrolling down we see that Kenya had 1088 contracts.
  11. Close this dialogue. To learn more about the variables, click on the toolbar button with an i (for information). This opens a third window. Make it larger and also make the first and second fields, called Name and label respectively, wider.
  12. Each row here shows information about a column of the data. This includes a longer and more descriptive name for each variable. For example the column ca_year is the year of the signature date for a contract.
  13. Scrolling to the bottom we see that some of the last columns refer to tax havens.
  14. We have started to understand the columns of data. It isn't necessary to understand them all, before starting the analysis, because the main ones have been defined, as we shall see later.
  15. Before closing the metadata window we look at one of the other data sets, the third, which is at the country-level.
  16. This has just 170 rows, one for each country.
  17. We also look at the meta-data for this data frame.
  18. We are aiming to draw a map. The 4th column gives the country names, ready for a map. As an example we will also look at the index of corruption by Transparency International in 2013. That is called cpi2013.
  19. Now we first close the metadata window, either by clicking again on the icon, or using the arrow to reset to the default positions.
  20. You may already have a menu called Procurement in your version of R-Instat. If not, use the View menu and tick on the option shown.
  21. From the Procurement menu, click on Mapping, then select Map Country Values. In the dialogue, select country_name_map and click Add, then select cpi2013 and click Add. You can just click OK here to give a world map, but we choose an option to show Africa. Click on Map Options, then select Choose Regions. Select Africa, click Return, then click OK. Here is the resulting map.
  22. The next presentations will continue with the analyses.
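
In code terms, step 21 pairs each country name with a value and optionally restricts the rows to one region before drawing. The Python sketch below shows only that data preparation, not the drawing. The `region` column, the rows, and the numbers are invented placeholders (they are not real CPI scores); only `country_name_map` and `cpi2013` come from the script.

```python
# Invented placeholder rows; the cpi2013 numbers here are NOT real index values.
countries = [
    {"country_name_map": "Kenya", "region": "Africa", "cpi2013": 1.0},
    {"country_name_map": "Ghana", "region": "Africa", "cpi2013": 2.0},
    {"country_name_map": "Peru", "region": "Americas", "cpi2013": 3.0},
    {"country_name_map": "Mali", "region": "Africa", "cpi2013": None},
]


def map_values(rows, name_col, value_col, region=None):
    """Return {country name: value}, optionally limited to one region.

    Rows with a missing value are skipped here for simplicity.
    """
    selected = {}
    for row in rows:
        if region is not None and row["region"] != region:
            continue
        if row[value_col] is None:
            continue
        selected[row[name_col]] = row[value_col]
    return selected


world = map_values(countries, "country_name_map", "cpi2013")
africa = map_values(countries, "country_name_map", "cpi2013", region="Africa")

print(world)
print(africa)
```

Leaving `region` unset corresponds to clicking OK straight away for a world map; passing a region mirrors the Choose Regions option.
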
rdstern commented 6 years ago

Thanks. I will get Danny also to go over this. How long was the video - roughly?

mmumbo commented 6 years ago

We managed to get the latest version with the updated data sets. The updated script is above; by our timing it takes between 4 and 5 minutes.