Closed by jyk 1 year ago
@jyk, Thank you for your suggestions.
Currently diagnose() does not support group_by().
The EDA functions describe(), correlate(), and normality() support group_by(), but the functions in the Data Diagnosis task do not. This is because they were designed for diagnosing data when you first encounter it.
That said, I think your request is meaningful, because data quality can be problematic only in certain categories.
I will modify diagnose() to work with group_by(). Just give me some time.
Thanks. Yes, I would like to use group_by() for different time periods (for example, on a weekly or monthly basis) to track data quality issues over time while scoring the models.
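The per-period tracking described above can be sketched with dplyr alone. This is only an illustration under stated assumptions: the `scores` table and its `scored_at` and `score` columns are hypothetical, and `summarise()` stands in for a grouped diagnosis by computing a missing-value rate per month.

```r
library(dplyr)

# Hypothetical scoring log: one row per day, with a few missing scores.
scores <- tibble(
  scored_at = as.Date("2023-01-01") + 0:89,
  score = replace(seq(0.1, 0.99, length.out = 90), c(5, 40, 41, 80), NA)
)

# Derive a monthly period label, then summarise missingness per period.
monthly_quality <- scores |>
  mutate(period = format(scored_at, "%Y-%m")) |>
  group_by(period) |>
  summarise(
    n = n(),
    missing_count = sum(is.na(score)),
    missing_percent = 100 * mean(is.na(score)),
    .groups = "drop"
  )

print(monthly_quality)
```

Changing the format string passed to `format()` changes the bucket size, so the same pattern should carry over to weekly periods.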
@jyk,
I implemented your suggestions in the GitHub development version 0.6.2.9000.
The functions that support group_by() are as follows:
diagnose_outlier()
Thanks
Thank you very much! Now I am fine.
Dear Lorenzo Fabbri,
To group by a column whose name is stored in a string, pass it to group_by() through across(all_of()), as in the following example.
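library(dplyr)  # provides group_by(), across(), and all_of()

# Name of the grouping column, given as a string
grouping_var <- "death_event"
dlookr::heartfailure |>
  group_by(across(all_of(grouping_var))) |>
  dlookr::diagnose_category()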
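Why embracing a string does not group by the named column can be sketched with dplyr alone. The helpers `count_by_embrace()` and `count_by_name()` are hypothetical names introduced for illustration, `summarise()` stands in for the diagnosis step, and the built-in `mtcars` data replaces the original data set.

```r
library(dplyr)

# Hypothetical helper: embraces the argument. With a string input, group_by()
# sees a constant value, so every row falls into a single group.
count_by_embrace <- function(data, grouping_var) {
  data |>
    group_by({{ grouping_var }}) |>
    summarise(n = n(), .groups = "drop")
}

# Hypothetical helper: selects the grouping column by its name.
count_by_name <- function(data, grouping_var) {
  data |>
    group_by(across(all_of(grouping_var))) |>
    summarise(n = n(), .groups = "drop")
}

by_embrace <- count_by_embrace(mtcars, "cyl")  # one row covering all 32 cars
by_name <- count_by_name(mtcars, "cyl")        # one row per cylinder count

print(by_embrace)
print(by_name)
```

Embracing is meant for unquoted column names passed as function arguments, whereas `across(all_of())` is the form that accepts a character vector of column names.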
Regards, choonghyun
-----Original Message----- From: "Lorenzo @.> To: @.>; Cc: "Choonghyun @.>; @.>; Sent: 2023-06-03 (Sat) 00:09:47 (GMT+09:00) Subject: Re: [choonghyunryu/dlookr] add group_by() functionality (Issue #90)
I wrote a function that takes as input a string (in this specific case "cohort") naming a factor to pass to dplyr::group_by(), which is called before diagnose_category(): dlookr::diagnose_category(dat |> dplyr::group_by({{ grouping_var }})). It does not produce any error, but in the resulting tibble the column cohort contains only the value cohort rather than its levels. I guess it is related to the use of {{, but I have not found a solution.
Hi, maybe I am missing something, but I currently cannot use group_by() followed by diagnose() etc. Would it be possible to add this functionality ("diagnose for groups")?