Closed zander-prinsloo closed 7 months ago
Task linked: CU-86869wcrp {wbpip} maintenance
Task linked: CU-8686teb15 refactor for consistency
Hi @tonyfujs,
Even though this PR includes many changes, none of them affect the regular behavior of `wbpip`. As you can see below, all tests, including several new ones, are passing. The main purposes of these changes are:

- `pipster` package
- `md_compute_poverty_stats()` and `gd_compute_poverty_stats_lq()` to allow vectorization (not fully completed yet), and allow flexibility for additional functions.

Also, we created several subfunctions to be used inside these two and, more importantly, to be used independently from them. That is, the user can call the new subfunctions (e.g., `md_compute_headcount`) without having to call `md_compute_poverty_stats`. Since these new subfunctions are exposed, some checks are needed and some efficiency is lost. I tested the performance of the new and old `md` and `gd` functions and the difference is negligible. For `md` I used 20 million observations, and you can see that, though the new function is a little slower, the difference in milliseconds is irrelevant.

Please let me know if you have any questions.
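To illustrate the design described in this comment, here is a minimal base R sketch of subfunctions that validate their own inputs, so they can be called on their own, plus a wrapper that reuses them. The names `headcount_sketch()`, `poverty_gap_sketch()`, and `poverty_stats_sketch()` and their arguments are hypothetical; they are not the actual `wbpip` signatures.

```r
# Hypothetical sketch; not the wbpip implementation.

# Weighted headcount ratio: share of the weighted population below the line.
headcount_sketch <- function(welfare, weight = rep(1, length(welfare)), povline) {
  # Checks are needed because the function is exposed to users directly
  stopifnot(
    is.numeric(welfare), is.numeric(weight), is.numeric(povline),
    length(welfare) == length(weight),
    length(povline) == 1L
  )
  sum(weight[welfare < povline]) / sum(weight)
}

# Weighted poverty gap: mean shortfall relative to the line (zero if not poor).
poverty_gap_sketch <- function(welfare, weight = rep(1, length(welfare)), povline) {
  stopifnot(
    is.numeric(welfare), is.numeric(weight), is.numeric(povline),
    length(welfare) == length(weight),
    length(povline) == 1L
  )
  gap <- pmax(povline - welfare, 0) / povline
  sum(weight * gap) / sum(weight)
}

# Wrapper that only combines the independent subfunctions.
poverty_stats_sketch <- function(welfare, weight, povline) {
  list(
    headcount   = headcount_sketch(welfare, weight, povline),
    poverty_gap = poverty_gap_sketch(welfare, weight, povline)
  )
}

welfare <- c(1, 2, 3, 4)
weight  <- c(1, 1, 1, 1)

headcount_sketch(welfare, weight, povline = 2.5)      # subfunction on its own
poverty_stats_sketch(welfare, weight, povline = 2.5)  # wrapper
```

The trade-off mentioned in the comment shows up here: each subfunction repeats its checks, so the wrapper validates the same inputs more than once.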
`md_compute_poverty_stats()`, 20M obs

`gd_compute_poverty_stats_lq()`
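A comparison like the one reported here could be reproduced along the following lines. This is only a sketch under stated assumptions: `old_sketch()` and `new_sketch()` are hypothetical stand-ins rather than the old and new `wbpip` functions, the data are simulated, and the sample size is reduced so the example runs quickly.

```r
# Hypothetical timing sketch; the functions are stand-ins, not wbpip code.
set.seed(1)
n <- 1e6                 # the md comparison in this PR used 20M observations
welfare <- rlnorm(n)     # simulated welfare vector
weight  <- runif(n)      # simulated weights
povline <- 1

# Stand-in for the "old" style: no input checks
old_sketch <- function(welfare, weight, povline) {
  sum(weight[welfare < povline]) / sum(weight)
}

# Stand-in for the "new" style: same computation, with checks first
new_sketch <- function(welfare, weight, povline) {
  stopifnot(
    is.numeric(welfare), is.numeric(weight),
    length(welfare) == length(weight),
    length(povline) == 1L
  )
  old_sketch(welfare, weight, povline)
}

t_old <- system.time(r_old <- old_sketch(welfare, weight, povline))
t_new <- system.time(r_new <- new_sketch(welfare, weight, povline))

identical(r_old, r_new)  # both versions should return the same estimate
c(old = t_old[["elapsed"]], new = t_new[["elapsed"]])  # elapsed seconds
```

A single `system.time()` call is noisy, so repeated runs would give a more reliable picture of the overhead added by the checks.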
Hello - I have not done a full review, but made some comments. Please take a look, and let me know what you think. Thanks!
Some comments are about things that happen across the PR (I haven't highlighted every line of code where this happens), for instance the use of `cli` in low-level functions and referencing packages with `package_name::function_name`.
Hi @tonyfujs, regarding `cli`: I think there is no cost in including it and the gains in clarity are great. Yet, we can remove them. I understand what you mean about low-level functions, but I think we should find a middle ground, especially for conditions. I can compromise by removing the `cli` calls, but I think the conditions in some of the functions should remain.
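One possible middle ground is to keep the checks but raise them as base R conditions instead of through `cli`. The sketch below assumes a hypothetical helper `check_povline_sketch()` and a hypothetical condition class `invalid_povline_sketch`; neither is taken from `wbpip`.

```r
# Hypothetical sketch: a low-level check that signals a classed base R error.
check_povline_sketch <- function(povline) {
  if (!is.numeric(povline) || anyNA(povline) || any(povline <= 0)) {
    stop(errorCondition(
      message = "`povline` must be a positive, non-missing number.",
      class   = "invalid_povline_sketch",  # hypothetical class name
      call    = NULL
    ))
  }
  invisible(povline)
}

# Callers can handle the condition by class rather than by parsing the message.
res <- tryCatch(
  check_povline_sketch(-1),
  invalid_povline_sketch = function(e) conditionMessage(e)
)
res
```

Because the error carries its own class, higher-level functions could still format a friendlier message later without the low-level function depending on `cli`.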
Regarding package references, I'll make sure to reference them properly, but be aware that some functions from `cli` and all the functions from `collapse` have been imported, so there is no need to reference them. This is an important point: the `collapse` package brings a lot of benefits that can be gradually implemented in `wbpip` to speed up processes, which is why I imported the whole thing.
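The two referencing styles under discussion look roughly like this. The sketch uses `stats::weighted.mean()` as a stand-in so that it runs without extra dependencies; the function names `mean_explicit()` and `mean_imported()` are hypothetical. In a package, the unqualified form relies on an import declared in `NAMESPACE`, for example through roxygen2 `@importFrom` or `@import` tags.

```r
# Explicit reference: the origin of the function is visible at the call site.
mean_explicit <- function(x, w) {
  stats::weighted.mean(x, w)
}

# Imported reference: in a package this relies on a NAMESPACE import, e.g.
#   #' @importFrom stats weighted.mean
# or, to import everything a package exports,
#   #' @import collapse
mean_imported <- function(x, w) {
  weighted.mean(x, w)
}

x <- c(1, 2, 3)
w <- c(1, 1, 2)

mean_explicit(x, w)
mean_imported(x, w)
```

Both calls return the same value; the difference is only in how easily a reader can tell which package a function comes from, and in the risk of name clashes when a whole package is imported.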
Hi @tonyfujs,
You can check the PR again. I removed all the informative `cli` messages that we introduced this round, but I left the ones that were already present in the package. All the tests are passing. Please feel free to push back as much as needed.
Thanks.
Hi @tonyfujs,

This PR is ready. All checks pass.

Specific points:

- `gd_lq_key_values` - new function, used in places such as `gd_estimate_dist_stats`
- `gd_compute_pov_*_lb` functions
- `gd_compute_poverty_stats_lq_replacement.R` and all associated functions
- `md_compute_poverty_stats_replacement.R` and all associated functions
- `gd_utils.R`