wingyiuc / dsw-project


Redundant one-hot encoding #11

Closed wingyiuc closed 6 months ago

wingyiuc commented 6 months ago

one_hot_encoder__cleaning_fee_False             bool
one_hot_encoder__cleaning_fee_True              bool
one_hot_encoder__host_has_profile_pic_f         bool
one_hot_encoder__host_has_profile_pic_t         bool
one_hot_encoder__host_has_profile_pic_nan       bool
one_hot_encoder__host_identity_verified_f       bool
one_hot_encoder__host_identity_verified_t       bool
one_hot_encoder__host_identity_verified_nan     bool
one_hot_encoder__instant_bookable_f             bool
one_hot_encoder__instant_bookable_t             bool

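The source columns here are already boolean. One-hot encoding them produces complementary pairs (for example `cleaning_fee_False` and `cleaning_fee_True`) in which one column is just the negation of the other, so the extra columns are redundant.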
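To illustrate the redundancy, here is a minimal sketch using pandas. The sample values are made up, and `pd.get_dummies` stands in for whatever encoder the project pipeline actually uses; only the column names come from the list above.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the issue.
df = pd.DataFrame({
    "cleaning_fee": [True, False, True],
    "instant_bookable": ["t", "f", "f"],
})
binary_cols = ["cleaning_fee", "instant_bookable"]

# Full one-hot encoding: two complementary columns per binary feature.
full = pd.get_dummies(df, columns=binary_cols)

# drop_first=True keeps a single indicator column per binary feature.
reduced = pd.get_dummies(df, columns=binary_cols, drop_first=True)

print(list(full.columns))
print(list(reduced.columns))
```

In the full encoding, `cleaning_fee_False` and `cleaning_fee_True` always sum to 1 in every row, so one of them can be dropped without losing information. If the pipeline uses scikit-learn's `OneHotEncoder`, its `drop="if_binary"` option serves a similar purpose. Columns with a third `nan` category, such as `host_has_profile_pic`, are not strictly binary and may need missing values handled separately.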

wingyiuc commented 6 months ago

Also, the coefficients for these binary variables seem wrong. I thought they should be positive.

vanessadada commented 6 months ago

Solved in the latest pull request.