Closed by peter-callahan 2 years ago
Are there enough data points to perform a prediction in the geographical area of interest?
Yes, we have enough data to begin, and can collect more as needed.
What data cleaning tasks are necessary?
Data type conversions, replacing NaNs and placeholder strings, and strategically dropping some categories/columns. Dropping forward-looking values is important for avoiding leakage.
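A rough sketch of what those cleaning steps could look like in pandas. The column names (`list_price`, `sold_price`, `days_on_market`) and the placeholder strings are assumptions for illustration; the actual schema is not listed in this issue.

```python
import numpy as np
import pandas as pd

# Hypothetical names: columns assumed to be known only after the prediction date.
LEAKY_COLUMNS = ["sold_price", "days_on_market"]
# Hypothetical string stand-ins for missing values.
PLACEHOLDERS = ["", "N/A", "unknown", "--"]
# Columns assumed to hold numbers, possibly stored as text.
NUMERIC_COLUMNS = ["list_price", "sqft", "beds", "bathrooms"]


def _strip_currency(value):
    """Remove '$' and thousands separators from strings; leave other values alone."""
    if isinstance(value, str):
        return value.replace("$", "").replace(",", "").strip()
    return value


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Drop forward-looking columns first so they cannot leak into features.
    out = out.drop(columns=[c for c in LEAKY_COLUMNS if c in out.columns])
    # Turn placeholder strings into real NaN values.
    out = out.replace(PLACEHOLDERS, np.nan)
    # Convert numeric-looking text to numbers; anything unparseable becomes NaN.
    for col in NUMERIC_COLUMNS:
        if col in out.columns:
            out[col] = pd.to_numeric(out[col].map(_strip_currency), errors="coerce")
    return out
```

Dropping the leaky columns before any other step keeps them out of later imputation or feature engineering by accident.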
Is missing data a problem for any particular columns?
Yes, missing values in the key columns sqft, beds, and bathrooms pose a risk. We will need to be cautious when filling those.
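One cautious way to handle this, sketched under assumptions: measure how much is missing per key column, then fill with a simple statistic while keeping a flag that records which rows were imputed. Median filling is just an example choice here, not something decided in this issue, and the column labels are assumed to match the names above.

```python
import pandas as pd

# Names taken from the discussion above; the real column labels may differ.
KEY_COLUMNS = ["sqft", "beds", "bathrooms"]


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Fraction of missing values in each key column that is present."""
    cols = [c for c in KEY_COLUMNS if c in df.columns]
    return df[cols].isna().mean()


def fill_with_flags(df: pd.DataFrame, fill_values) -> pd.DataFrame:
    """Fill missing values and add a '<col>_missing' flag per filled column."""
    out = df.copy()
    for col, value in fill_values.items():
        # Record which rows were missing before the fill overwrites them.
        out[f"{col}_missing"] = out[col].isna()
        out[col] = out[col].fillna(value)
    return out


# Toy data for illustration only.
train = pd.DataFrame({
    "sqft": [1000.0, None, 3000.0],
    "beds": [2.0, 3.0, None],
    "bathrooms": [1.0, 2.0, 2.0],
})
shares = missing_share(train)
medians = train[KEY_COLUMNS].median()  # compute on training data only
filled = fill_with_flags(train, medians)
```

Computing the fill values on the training split alone, and reusing them for validation and test data, avoids a second source of leakage.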
Do any basic patterns emerge that increase/decrease trust in the dataset?
None that cannot be handled with data cleaning or by expanding the dataset.
Questions that should be addressed during EDA: