karthikeyann12 closed this 7 years ago
In several cases:
services_performed = (diagnostic + other + therapeutic)
If this equality does not hold for a record, that record could be an anomaly. You might also want to check the corresponding percentage columns to identify an anomaly.
So, I'm guessing the NAs in those columns actually have some meaning.
For example, an NA in the diagnostic column would mean that the doc_id has not performed any diagnostic service in that (city, state) combination, even though the same doc_id rendered therapeutic/other services there.
You could replace the NAs with 0 if the record isn't an anomaly.
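Here is a rough sketch of that check in plain Python, in case it helps. The column names are the ones from this thread; the sample records are made up, and None stands in for NA. It assumes the service columns are integer counts, so an exact equality test is safe.

```python
# Hypothetical sample records; None stands in for NA.
records = [
    {"doc_id": 1, "services_performed": 10, "diagnostic": 4, "other": 1, "therapeutic": 5},
    {"doc_id": 2, "services_performed": 7, "diagnostic": None, "other": 3, "therapeutic": 4},
    {"doc_id": 3, "services_performed": 9, "diagnostic": None, "other": 2, "therapeutic": 3},
]

PARTS = ("diagnostic", "other", "therapeutic")

def is_consistent(rec):
    """True when the parts, with NA counted as 0, add up to the total."""
    return sum(rec[p] or 0 for p in PARTS) == rec["services_performed"]

def fill_na_with_zero(rec):
    """Return a copy with NA parts set to 0; the input record is not modified."""
    return {k: (0 if k in PARTS and v is None else v) for k, v in rec.items()}

# Fill NAs only where the sum checks out; leave suspect records untouched.
cleaned = [fill_na_with_zero(r) if is_consistent(r) else r for r in records]
anomalies = [r["doc_id"] for r in records if not is_consistent(r)]
```

With this sample, doc_id 2 gets its diagnostic NA replaced by 0 because 0 + 3 + 4 matches its total of 7, while doc_id 3 is flagged because its parts add up to 5 against a total of 9.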
My 2 cents...
@karthikeyann12 The answer to replacing NAs in the individual service columns lies in the total services column and how it relates to the therapeutic, diagnostic and other service columns. @eashwarsiddharth Correct.
@Rajhan There are certain anomalies in these calculations (total drug cost/total claim count); removing these anomalies would eliminate a large portion of the data.
Should I treat the NAs in the diagnostic/other/therapeutic columns with the average/median? If I treat them, is it okay to treat the corresponding percentage columns as well?