sebastianbarfort / sds

Social Data Science, course at University of Copenhagen
http://sebastianbarfort.github.io/sds/

Group 17: Assignment 3 #61

Closed CarolineMarkMortensen closed 8 years ago

CarolineMarkMortensen commented 8 years ago

Assignment 3

The topic of our exam project is housing for sale. We want to scrape data from www.boligsiden.dk, which lists the houses and apartments currently for sale. For each listing we want the price, the size, the number of rooms, the days on market (DOM) and the location. If possible, we would also like to scrape the archive function on www.boligsiden.dk for houses and apartments that were previously for sale.

We think the housing market is a very interesting topic right now because the market is sharply divided. In the big cities we observe rising prices and very low DOM, while in the rural areas we see houses with low prices and a relatively high DOM. We know the topic has been studied before and is continuously studied, but since everybody needs housing and the housing market changes all the time, it is always interesting to look into patterns in housing trends. Normally, house price predictions in Denmark are based on houses that have already been sold, which delays the house price index by approximately 6 months. We look at house prices in real time, before the houses are sold.

We want to examine whether there is a correlation between the total number of apartments for sale and the number of adults aged 18-90. A possible angle is to look at how many people live alone in the Copenhagen area and compare that with the data from KK on how many apartments there are in KBH. The new trend in housing is that more people live alone compared to 20 and 50 years ago. It reflects the change in our society and that the family is no longer the center of attention. We could look at historical data on families in Denmark using statistikbanken.dk.

We want to see in which areas houses and apartments are sold fastest, and what kinds of houses and apartments have a high DOM. We would like to predict the DOM for different apartments based on criteria such as location, number of rooms and price per square meter.
When looking at DOM we would like to use the archive function on boligsiden.dk. If possible, we would like to predict house prices for the bigger cities depending on size, number of rooms and location. The main problem with scraping data from boligsiden.dk is that we can only see the listing price and not the final sale price, so our data will be a biased measure of actual housing prices.

We expect to look into the differences in prices across all of Denmark and the development in family types and single-person households. We also want to predict DOM and house prices in two big cities and compare the differences.
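
As a rough sketch of the variables described above, the snippet below shows how price per square meter and DOM could be derived once listings have been scraped. Everything in it is an assumption for illustration: the `Listing` fields, the helper names and the example rows are made up and do not reflect the actual structure of boligsiden.dk or any real listing. DOM is computed here as the number of days between a hypothetical listing date and a chosen reference date.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class Listing:
    # Hypothetical record layout; the real fields depend on what the scrape returns.
    area: str
    price_dkk: int
    size_m2: int
    rooms: int
    listed_on: date


def price_per_m2(listing):
    """Listing price divided by size, in DKK per square meter."""
    return listing.price_dkk / listing.size_m2


def days_on_market(listing, as_of):
    """Days between the listing date and the reference date."""
    return (as_of - listing.listed_on).days


def median_dom_by_area(listings, as_of):
    """Group listings by area and take the median DOM in each group."""
    dom = defaultdict(list)
    for listing in listings:
        dom[listing.area].append(days_on_market(listing, as_of))
    return {area: median(values) for area, values in dom.items()}


# Made-up example rows, not real listings.
listings = [
    Listing("Copenhagen", 3_000_000, 75, 3, date(2015, 10, 1)),
    Listing("Copenhagen", 2_400_000, 60, 2, date(2015, 10, 21)),
    Listing("Rural", 900_000, 120, 4, date(2015, 5, 4)),
]
as_of = date(2015, 11, 1)

for listing in listings:
    print(listing.area, price_per_m2(listing), days_on_market(listing, as_of))
print(median_dom_by_area(listings, as_of))
```

The median is used rather than the mean because a few listings that sit unsold for a very long time would otherwise dominate the comparison between areas. Note that prices here are listing prices, so the price per square meter carries the same bias mentioned above.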