Currently, 32 of the 135 columns contain at least one empty (NaN) cell, so we need to choose a method for filling them in.
The following 32/135 column(s) have NaN data (23.70% of columns)
high_quality_manual 1444
label_severity_min 1
label_severity_max 1
label_severity_mean 1
label_severity_sd 1
curb_ramp_severity_min 99
curb_ramp_severity_max 99
curb_ramp_severity_mean 99
curb_ramp_severity_sd 152
missing_curb_ramp_severity_min 214
missing_curb_ramp_severity_max 214
missing_curb_ramp_severity_mean 214
missing_curb_ramp_severity_sd 326
obstacle_severity_min 345
obstacle_severity_max 345
obstacle_severity_mean 345
obstacle_severity_sd 512
surface_problem_severity_min 398
surface_problem_severity_max 398
surface_problem_severity_mean 398
surface_problem_severity_sd 567
no_sidewalk_severity_min 639
no_sidewalk_severity_max 639
no_sidewalk_severity_mean 639
no_sidewalk_severity_sd 790
tutorial_minutes 251
tutorial_error_count 251
accuracy 2
curb_ramp_accuracy 106
missing_curb_ramp_accuracy 229
obstacle_accuracy 404
surface_problem_accuracy 425
dtype: int64
Empty cells in these 32 column(s) will be replaced by the mean of their respective columns
<ipython-input-39-6f8a379c8883>:95: FutureWarning: Dropping of nuisance columns in DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version this will raise TypeError. Select only valid columns before calling the reduction.
df_users.fillna(df_users.mean(), inplace=True)
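The FutureWarning above is raised because `df_users.mean()` is called on a frame that still contains non-numeric columns. One possible way to avoid it is to select the numeric columns first and fill only those. The sketch below is a minimal illustration, not the notebook's actual code: the small `df_users` frame and its `user_id` column are hypothetical stand-ins, and the sketch assumes that every column to be filled has a numeric dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_users: one text column and two numeric
# columns with gaps. The real frame has 135 columns.
df_users = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "accuracy": [0.5, np.nan, 0.7, 0.9],
    "tutorial_minutes": [10.0, 20.0, np.nan, np.nan],
})

# Select the numeric columns explicitly so the reduction never sees
# text columns (this is what the warning asks for).
numeric_cols = df_users.select_dtypes(include="number").columns

# Per-column means; NaN values are skipped by default.
col_means = df_users[numeric_cols].mean()

# Fill each numeric column with its own mean and assign the result back.
df_users[numeric_cols] = df_users[numeric_cols].fillna(col_means)

# Count the NaN cells left in the numeric columns.
remaining_nan = int(df_users[numeric_cols].isna().sum().sum())
print(remaining_nan)
```

Under this sketch, any column that `select_dtypes(include="number")` does not pick up is left unchanged, so such columns would still need their own fill strategy.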