RHoMIS / rhomis-R-package

An R package for essential processing of data collected with RHoMIS
GNU General Public License v3.0

Issues in livestock_tlu_calculations() #32

Open rfrelat opened 2 years ago

rfrelat commented 2 years ago

I spotted 5 issues with the TLU calculations:

  1. rabbits, llama, and doves are missing from livestock_tlu. For instance:
     - id_hh b61eb6de5f1a63448e54d8945e1c3f2d: rabbits missing (+150*0.02)
     - id_hh ce410e673dfd1e8d510d9a970e6428f9: llama missing (+4*0.7)
     - id_hh e5d2a4187a61f6dca0fa29115997d417: doves missing (+1*0.1)

  2. In livestock_name.csv, 'ducks' is converted to 'ducks', so there is a column 'livestock_heads_ducks'. In livestock_tlu, however, the expected spelling is 'duck', so ducks are not counted in the TLU calculation. The easiest fix might be to correct the name in livestock_name.csv.

  3. The head numbers in some livestock categories are logical (only TRUE or FALSE). This is the case for livestock_heads_oxen, livestock_heads_guinea_pigs, livestock_heads_buffalo, livestock_heads_ducks, livestock_heads_camel and livestock_heads_dogs. These categories might need to be removed from the TLU calculation, since TRUE will be wrongly counted as 1 instead of the real number of livestock heads.

  4. When livestock_other1 == livestock_other2, only one number is reported by clean_tlu_column_names(). The only case I found is id_hh 15812490d853ba30fea575e374002a6c, with one donkeys_horses reported as livestock_other1 and one donkeys_horses reported as livestock_other2 (is this repeated information?). After clean_tlu_column_names(), the column donkeys_horses lists only 1 donkeys_horses (instead of 2, I guess). From what I can see, this happens in the line:

     position_to_add <- which(colnames(data) %in% c("livestock_other1", "livestock_other2", "livestock_other3")) %>% max()

     where max() could be replaced by sum() if we want to combine the entries (assuming they are not duplicated information).

  5. It would be good to automatically check that all livestock_heads are positive or 0. The only error I found is id_hh 9c440cb6cf067f6c22ee4efce99ee965 with livestock_heads_cattle=-99.
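
Points 3 and 5 could be caught by a small check that runs before the TLU calculation. The following is only a sketch in base R: `check_livestock_heads()` is a hypothetical helper, not a function in the rhomis package, and it assumes that every head-count column starts with the prefix `livestock_heads_`.

```r
# Hypothetical helper (not part of the rhomis package): report
# livestock_heads_* columns that are logical or contain negative values.
check_livestock_heads <- function(data, prefix = "livestock_heads_") {
  head_cols <- grep(paste0("^", prefix), colnames(data), value = TRUE)

  # Columns read as logical (TRUE/FALSE) cannot hold real head counts
  is_logical <- vapply(data[head_cols], is.logical, logical(1))

  # Count negative entries in numeric columns, ignoring NA
  n_negative <- vapply(
    data[head_cols],
    function(x) if (is.numeric(x)) sum(x < 0, na.rm = TRUE) else 0L,
    integer(1)
  )

  list(
    logical_columns = head_cols[is_logical],
    negative_counts = n_negative[n_negative > 0]
  )
}

# Made-up example data, for illustration only
example <- data.frame(
  id_hh = c("a", "b", "c"),
  livestock_heads_cattle = c(3, -99, 0),
  livestock_heads_oxen = c(TRUE, FALSE, NA),
  livestock_heads_goats = c(5, 2, NA)
)

report <- check_livestock_heads(example)
print(report$logical_columns)  # columns that hold TRUE/FALSE
print(report$negative_counts)  # number of negative entries per column
```

In this example, livestock_heads_oxen is flagged as logical and livestock_heads_cattle is reported with one negative entry. Whether flagged columns should be dropped or re-read with the correct type is a separate decision.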

l-gorman commented 2 years ago

Thanks @rfrelat! I have added you as a contributor to the package for all of these helpful reports. I hope that's okay, but I thought it would be important to acknowledge this!

These are really helpful points, I will respond to these in order:

  1. Yes, currently we rely on the package to provide conversion factors. The problem with this is that when new animals come up in the "other1", "other2"... columns, they are not considered in the TLU calculations. In this case, the TLU conversions need to be extracted into a table and verified by the user. I am currently working on this on this branch, which I plan to merge by the end of the month.

  2. Agreed, I'll stick with duck then! Well spotted!

  3. Okay, I will make sure to type check the result. I am curious as to why any of these come through as TRUE or FALSE in the first place. If I find time, I will do a little digging.

  4. Okay, that is interesting. I am not sure what we should do in this instance, as we should not really have duplicates in other1 and other2. I think we should ask @JimHam and MvW what to do here.

  5. Okay, we can add this check. Actually -99 should be "-999", which is the response that enumerators are supposed to enter when the respondent does not know. I will also do a general check in the data for "-9", "-99", and "-999", as there is no point in the survey where a negative value should be expected. Again, it would be good for @JimHam and MvW to confirm.
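
As a rough illustration of points 1 and 5, the sketch below recodes the "don't know" codes as NA and computes TLU from a user-supplied conversion table, warning about animals that have no factor. Everything here is an assumption for illustration: `recode_missing_codes()` and `tlu_from_table()` are hypothetical helpers, the table layout (columns `animal` and `tlu`) is invented, the factor values are placeholders rather than the package's, and missing counts are assumed to contribute 0 TLU.

```r
# Hypothetical helper: treat the enumerator codes -9, -99 and -999 as missing
recode_missing_codes <- function(x, codes = c(-9, -99, -999)) {
  x[x %in% codes] <- NA
  x
}

# Hypothetical helper: total TLU per household from a conversion table that
# the user has verified. Animals without a factor trigger a warning.
tlu_from_table <- function(data, conversions, prefix = "livestock_heads_") {
  head_cols <- grep(paste0("^", prefix), colnames(data), value = TRUE)
  animals <- sub(paste0("^", prefix), "", head_cols)
  factors <- conversions$tlu[match(animals, conversions$animal)]

  missing_factor <- animals[is.na(factors)]
  if (length(missing_factor) > 0) {
    warning(
      "No TLU conversion factor for: ",
      paste(missing_factor, collapse = ", ")
    )
  }

  total <- numeric(nrow(data))
  for (i in which(!is.na(factors))) {
    heads <- recode_missing_codes(data[[head_cols[i]]])
    heads[is.na(heads)] <- 0  # assumption: unknown counts contribute 0 TLU
    total <- total + heads * factors[i]
  }
  total
}

# Illustrative conversion table; the values are placeholders
conversions <- data.frame(
  animal = c("cattle", "rabbits"),
  tlu = c(0.7, 0.02)
)

# Made-up example data
households <- data.frame(
  id_hh = c("a", "b"),
  livestock_heads_cattle = c(2, -99),
  livestock_heads_rabbits = c(150, 0),
  livestock_heads_llama = c(0, 4)
)

# Warns that llama has no conversion factor
tlu <- tlu_from_table(households, conversions)
print(tlu)
```

Here the llama column is reported in a warning rather than silently ignored, and the -99 cattle entry does not subtract from the household total. Propagating NA instead of substituting 0 would be an equally defensible choice.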