Open · scott-yj-yang opened this issue 1 year ago
Hi @scott-yj-yang ,
Thanks for the feedback!
To address the issues with our background and the size of our data, we decided to focus on Moscow. Moscow is the capital of Russia and a hot real estate market. Restricting to Moscow also cuts our sample down to about 1/10th the size of the original dataset.
We also describe drawing a random subset from the Moscow data, since random sampling would avoid breaking the assumptions of models like OLS while reducing our computation time.
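A minimal sketch of that sampling step, using only the Python standard library. The `city` and `price` field names, the `sample_city_rows` helper, and the toy rows are all hypothetical stand-ins for illustration, not the actual dataset schema or our pipeline. The idea is to filter to one city, then draw rows uniformly without replacement with a fixed seed so the subset is reproducible.

```python
import random


def sample_city_rows(rows, city, n, seed=0, city_key="city"):
    """Keep the rows for one city, then draw n of them uniformly at random.

    Sampling is without replacement. A fixed seed makes the draw repeatable.
    All names here are illustrative assumptions, not the real schema.
    """
    city_rows = [r for r in rows if r[city_key] == city]
    rng = random.Random(seed)          # local RNG, does not touch global state
    n = min(n, len(city_rows))         # never ask for more rows than exist
    return rng.sample(city_rows, n)


# Toy data: 1000 listings, every tenth one tagged as Moscow (assumed layout).
rows = [
    {"city": "Moscow" if i % 10 == 0 else "Other", "price": 1000 + i}
    for i in range(1000)
]

subset = sample_city_rows(rows, "Moscow", n=20, seed=42)
print(len(subset))  # 20
```

Because every Moscow row has the same chance of being selected, the subset should stay representative of the Moscow data in expectation, which is the property we are relying on for the models.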
Do you think these updates are a good way to address the problems you mentioned?
In addition, we've performed EDA and are starting to implement models. Regarding this, we have two main questions:
Thank you in advance for your advice!
Background +0.75; Data +0.75
Oh, so sorry, I didn't see your comment last week. Yes, a smaller dataset sampled from the Moscow data definitely sounds much better. Feel free to try any techniques to accelerate training and testing!
Project Proposal Feedback
Score (out of 9)
Score = 9
Feedback:
Rubric
Scoring: Out of 9 points
If students address the detailed feedback in a future checkpoint, they will earn these points back.
Comments
In the background section you discuss how the real estate market is largely influenced by the economy and government policies, but neither of these is used in the solution you propose. I therefore suggest not emphasizing these two factors in your background section, and instead discussing why you are investigating only demographic information. You can still mention the impact of these two factors in your result analysis.
Another concern is that the dataset you picked is a bit too large. Usually more data yields more accurate and meaningful results, but as a dataset grows extremely large it imposes a lot of technical challenges, even for very simple models. Maybe you can consider reducing the scope of the problem, say by focusing only on real estate in one city or even one district (but explain why you picked that city or district)?