Open VietAnhNgg opened 3 days ago
Hi,
That is a formula for checking for outliers. Remember the Empirical Rule? For roughly normal data, about 99.7% of the values fall within 3 standard deviations of the mean. So we flag the values outside that range as outliers and keep the 99.7% of the data inside it. The 0.0015 and 0.9985 quantiles cut off 0.15% in each tail, which leaves that central 99.7%. Here's the code:
q_low = df['SalePrice'].quantile(0.0015)
q_high = df['SalePrice'].quantile(0.9985)
print(q_low, q_high)
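To make the cutoffs above concrete, here is a minimal sketch that actually filters with them and compares them to mean ± 3 standard deviations. It is only an illustration: `prices` is a hypothetical, synthetic stand-in for `df['SalePrice']` (roughly normal by construction), and the names `mask`, `kept` and `outliers` are mine, not from the original code.

```python
import numpy as np

# ASSUMPTION: synthetic, roughly normal prices standing in for df['SalePrice'].
rng = np.random.default_rng(0)
prices = rng.normal(loc=180_000, scale=40_000, size=10_000)

# Quantile cutoffs that leave about 0.15% of the data in each tail.
q_low, q_high = np.quantile(prices, [0.0015, 0.9985])

# Keep the central ~99.7% and set the rest aside as outliers.
mask = (prices >= q_low) & (prices <= q_high)
kept = prices[mask]
outliers = prices[~mask]

# For comparison: mean +/- 3 standard deviations from the Empirical Rule.
mean, std = prices.mean(), prices.std()
print("quantile cutoffs:", q_low, q_high)
print("mean +/- 3 std:  ", mean - 3 * std, mean + 3 * std)
print("kept:", len(kept), "outliers:", len(outliers))
```

On data that is close to normal the two pairs of cutoffs should land near each other. On skewed data (house prices often are) they can differ a lot, so it is worth printing both before dropping anything.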
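Another way to check is the interquartile range (IQR) rule:
import numpy as np
# First and third quartiles of the sale prices
Q1 = np.percentile(df['SalePrice'], 25)
Q3 = np.percentile(df['SalePrice'], 75)
interquartile = Q3 - Q1
print(Q1, Q3, interquartile)
# Values below q_low or above q_high are treated as outliers
q_low = Q1 - 1.5 * interquartile
q_high = Q3 + 1.5 * interquartile
print(q_low, q_high)
Do you get it?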
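Here is a small worked example of the IQR bounds, in case it helps to see the numbers. This is a sketch under assumptions: the `prices` array is made-up data standing in for `df['SalePrice']`, with one obviously extreme value at the end, and the names `lower`, `upper`, `inside` and `outliers` are mine.

```python
import numpy as np

# ASSUMPTION: made-up sample standing in for df['SalePrice'];
# the last value is deliberately extreme.
prices = np.array([100, 120, 130, 140, 150, 160, 170, 180, 200, 900], dtype=float)

# Quartiles and interquartile range.
q1, q3 = np.percentile(prices, [25, 75])
iqr = q3 - q1

# Bounds at 1.5 * IQR below Q1 and above Q3.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Split the data into points inside the bounds and outliers.
inside = prices[(prices >= lower) & (prices <= upper)]
outliers = prices[(prices < lower) | (prices > upper)]

print("Q1, Q3, IQR:", q1, q3, iqr)
print("bounds:", lower, upper)
print("outliers:", outliers)
```

Unlike the 3 standard deviation check, this one does not assume the data is normal, since quartiles are not pulled around much by a few extreme values.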
Come here and explain this for me: can you tell me the concept behind this chart and how to analyze it? I need your help.