IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0
38 stars, 102 forks

ILRI biometrics training resource #3738

Open rdstern opened 7 years ago

rdstern commented 7 years ago

The resource is available at https://www.ilri.org/biometrics/default.htm.

It can be downloaded, so it would be easy for us to supply it.

There are 17 case studies and all the data sets are included. The download is just over 500 MB zipped (about 670 MB unzipped).

The data sets all read easily into R-Instat (at least all those I have tried, which is about half the case studies). The resource also includes good, though old, SSC materials. Some links are not working, but it should be easy to get them all to work.

  1. The data files in Excel each have a "partner" in Word. This usually contains a brief overall description of the data file (data frame), plus an explanation of each variable. There are often 2 or 3 Excel files for a case study.
  2. I am not sure how to import from Excel including the variable labels. I wonder if it would be worth importing into R-Instat, then adding the labels, and then saving everything as a single R-Instat object.
  3. Then we could leave the ILRI resource totally separate, but make using the data sets for each case study really easy. We would have an ILRI Biometrics directory, containing either 17 R-Instat objects or 17 directories with one or two files in each.
  4. A separate task could then be to sort out the links (or get ILRI to do that). Given the closure of SSC, this could also be a neat way of keeping access to these resources.
  5. I don't think this is much work, and it fits - as does CAST - with ADI, i.e. a broad view of resources.

So I suggest we (at least) make a start on this for version 4.1.

The one item I am unsure about is how easily R-Instat could cope with quite a long description of a data frame - say 2 or 3 sentences. That is perhaps something to clarify with David and Danny - so I am starting with them as named individuals!

dannyparsons commented 7 years ago

I don't see any issue with this. It would be good to have something on this in 0.4.1, since it's for a biometry conference.

rdstern commented 7 years ago

CS1.zip

Here is the first file I would like. This could go into the library, in an ILRI directory. I hope someone could work on preparing an RDS file for each of the case studies. One problem is that I would like the variable labels to become part of this information. In the column metadata, pasting is not (yet) permitted. I wonder if that can change.

Given the topic it would be great to get this facility reasonably started by the SUSAN meeting on 20 August.

trottingafrican commented 7 years ago

We are beginning this work today. Which specific case studies have you worked on, @rdstern?

mmumbo commented 7 years ago

@rdstern

  1. Would you mind telling us which files you have worked on? We have managed to get the whole data set with all 17 case studies.
  2. How are we supposed to deal with the Excel data files with more than two data sheets?

rdstern commented 7 years ago

There seem to be 2 sorts of case studies in terms of the data files.

  1. Those with one (or more) Excel files, where each file is simple (i.e. one sheet).
     a) Those Excel files also seem to have a companion Word file with a description of each column and (sometimes) an overall description.
     b) The column descriptions can each be pasted into the column metadata. It is sometimes a bit slow, because you can only paste one label at a time. You may also edit the label at the same time, if it is unnecessarily wordy.
     c) Then the set of Excel files can become a single rds (Instat data) file.
     d) In those cases the name of the rds file can indicate the case study, and the file can go into the ILRI Biometrics directory.

  2. Those where the Excel files are more complicated. For now, in those instances, I would be inclined to make each of those a sub-directory and leave the files as you find them, i.e. just copy them over into that directory.
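
The two cases above can be sketched as a small planning function. This is only an illustration in Python, not part of R-Instat: the function name, the `Plan` record, and the example file names are hypothetical, and the sheet counts are passed in by hand rather than read from the Excel files. Only the "ILRI Biometrics" directory name comes from this thread.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    case_study: str
    action: str   # "single_rds" or "copy_to_subdirectory"
    target: str   # where the result would go in the library


def plan_case_study(case_study, sheet_counts, root="ILRI Biometrics"):
    """Decide how to package one case study.

    sheet_counts maps each Excel file name to its number of sheets
    (supplied by hand here; this sketch does not open the files).
    """
    if not sheet_counts:
        raise ValueError(f"no Excel files listed for {case_study}")

    # Case 1: every Excel file is simple (exactly one sheet), so the
    # set of files can become a single rds file named after the case study.
    if all(count == 1 for count in sheet_counts.values()):
        return Plan(case_study, "single_rds", f"{root}/{case_study}.rds")

    # Case 2: at least one file is more complicated, so for now the
    # files are just copied, unchanged, into a sub-directory.
    return Plan(case_study, "copy_to_subdirectory", f"{root}/{case_study}/")


# Hypothetical file names, for illustration only.
print(plan_case_study("CS1", {"data1.xls": 1, "data2.xls": 1}))
print(plan_case_study("CS2", {"data1.xls": 3}))
```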

mmumbo commented 7 years ago

  1. The pasting option should be enabled to make it easy to transfer the metadata labels. When can this be done? @dannyparsons @volloholic At the moment we only have the option of typing, which is time-consuming.

rdstern commented 7 years ago

I had that problem. Danny showed that if you click in the label field so you can type into it, then you can paste into that field.
You can't paste multiple fields but that helps.

dannyparsons commented 7 years ago

Yes, pasting a single label is possible, as Roger explains. You don't need to type it in. If it's not clear from the comment how to do that, just ask me or Roger and we can show you. Pasting multiple labels would be great, but it needs more careful validation, which is why it isn't implemented yet.
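
To illustrate the kind of validation a multi-label paste might need, here is a minimal sketch in Python. It is not R-Instat code, and the specific checks (one label per column, no blank labels) are assumptions made for the example; the function name `match_pasted_labels` is hypothetical.

```python
def match_pasted_labels(column_names, pasted_text):
    """Pair each column with one pasted label, one label per line.

    Assumed checks (not taken from R-Instat): the number of labels
    must equal the number of columns, and no label may be blank.
    """
    labels = [line.strip() for line in pasted_text.splitlines()]

    # Ignore blank lines at the end of the pasted text.
    while labels and labels[-1] == "":
        labels.pop()

    if len(labels) != len(column_names):
        raise ValueError(
            f"expected {len(column_names)} labels, got {len(labels)}"
        )

    blank = [name for name, label in zip(column_names, labels) if not label]
    if blank:
        raise ValueError(f"blank label for column(s): {', '.join(blank)}")

    return dict(zip(column_names, labels))


# Hypothetical column names and labels, for illustration only.
print(match_pasted_labels(["id", "weight"], "Animal identifier\nWeight in kg\n"))
```

Rejecting a mismatched paste outright, rather than filling as many labels as fit, avoids silently attaching labels to the wrong columns.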

dannyparsons commented 7 years ago

@mmumbo @PatrykNjoroge this would be really useful for the next version we take to the workshop. Are you able to do more on this this week? Otherwise someone else could take over if you're not available?

mmumbo commented 7 years ago

We are already back and will be working on them with Patrick for the better part of this week.

rdstern commented 7 years ago

That's great.

dannyparsons commented 7 years ago

How is this going? We would like at least some of these files in tomorrow's version.

mmumbo commented 7 years ago

We have finished all the files. We will share them by tomorrow morning, since my internet is not stable at the moment.
