GEMINI-Medicine / Rgemini

A custom R package that provides a variety of functions to perform data analyses with GEMINI data
https://gemini-medicine.github.io/Rgemini/
Other
3 stars 0 forks source link

28 add function to check missingness #49

Closed guoyi-smh closed 9 months ago

guoyi-smh commented 9 months ago

Adds n_missing function to check missingness. n_missing works on both vector and data.frame or data.table and allows user to specify either to return number of missingness or index. mi2 is kept as an alias of n_missing.

vaakesan-SMH commented 9 months ago

Just confirming @guoyi-smh that this function now supersedes GEMINIpkg::count_missing()? If so we will need to submit a merge request for GEMINIpkg that deprecates that function.

guoyi-smh commented 9 months ago

Just confirming @guoyi-smh that this function now supersedes GEMINIpkg::count_missing()? If so we will need to submit a merge request for GEMINIpkg that deprecates that function.

Yes I think it is safe to deprecate GEMINIpkg::count_missing().

vaakesan-SMH commented 9 months ago

Just confirming @guoyi-smh that this function now supersedes GEMINIpkg::count_missing()? If so we will need to submit a merge request for GEMINIpkg that deprecates that function.

Yes I think it is safe to deprecate GEMINIpkg::count_missing().

I think we will also need to do the same for mi2(). Could you create a merge request on GEMINIpkg which deprecates these two and assign me as reviewer to that as well?

guoyi-smh commented 9 months ago

Looks good! Thanks for the implementation @guoyi-smh.

Curious how we handle NULL, because it can't be added to the na_strings vector. We don't necessarily have to make any changes now, just wondering if we might need to extend the functionality at some point.

Good point. I did think of this but ended up not implementing it in this version. So far I have not encountered any NULL in our database. In lab table there are some "NULL" as strings so the function works on them if the user specify na_strings = "NULL". And it takes some hack to create a dataframe/data.table with NULL. So I think we are good for now. But definitely it will be great to add in the future as an improvement.

vaakesan-SMH commented 9 months ago

Good point. I did think of this but ended up not implementing it in this version. So far I have not encountered any NULL in our database. In lab table there are some "NULL" as strings so the function works on them if the user specify na_strings = "NULL". And it takes some hack to create a dataframe/data.table with NULL. So I think we are good for now. But definitely it will be great to add in the future as an improvement.

Makes sense to me. Feel free to go ahead and resolve merge conflicts, merge this branch (squash all commits when merging), and then merge your branch on GEMINIpkg.

Thanks @guoyi-smh !