Repository for Kristina Wright and Daniel Hadley's group project for STAT 547M
This repository houses Group 09's project for STAT 547M, taken in Term 2 of the 2019-2020 academic year.
Our project uses an Airbnb dataset to identify significant factors that explain nightly listing prices in Barcelona, Spain.
The final report is built up through a series of milestones, which are linked below.
As milestones are met, files are placed into the appropriate subfolders.
*.Rmd files used to create reports.
R scripts (*.r) that are called when rendering the project.

Milestone | Due Date :date: | Report |
---|---|---|
01 | February 29, 2020 | milestone01 |
02 | March 7, 2020 | milestone02 |
03 | March 14, 2020 | html and pdf |
Clone this repo.
Ensure the following R packages are installed:
tidyverse
here
docopt
knitr
DT
gridExtra
corrplot
glue
scales
broom
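One way to confirm that the packages listed above are present before running anything is a quick check from the terminal. The following is a minimal sketch, not part of this repo: `check_r_packages` is a hypothetical helper name, and it assumes `Rscript` is available on your `PATH`.

```shell
# Hypothetical helper (not part of this repo): reports any required R
# packages that are missing and returns a non-zero status if there are any.
check_r_packages() {
  Rscript -e '
    pkgs <- c("tidyverse", "here", "docopt", "knitr", "DT",
              "gridExtra", "corrplot", "glue", "scales", "broom")
    missing <- pkgs[!(pkgs %in% rownames(installed.packages()))]
    if (length(missing) > 0) {
      message("Missing packages: ", paste(missing, collapse = ", "))
      quit(save = "no", status = 1)
    }
    message("All required packages are installed.")
  '
}
```

After defining the function, run `check_r_packages` in the same shell. Any package it reports as missing can be installed from an R session with `install.packages()`.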
Option 1: Run the following scripts, in order, in a terminal from the main repo directory with the specified arguments:
a) Load data
Rscript scripts/load.R --data_url=https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/listings-Barcelona.csv
b) Clean data
Rscript scripts/process.R --path_raw=data/raw_listings.csv --path_clean=data/clean_listings.csv
c) Exploratory data analysis
Rscript scripts/EDA.R --path_clean=data/clean_listings.csv --path_image=images/
d) Linear Regression
Rscript scripts/lm.R --path_data=data/clean_listings.csv
e) Knit final report
Rscript scripts/knit.R --final_report="docs/final_report.Rmd"
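Because each step depends on the output of the one before it, the five commands above can also be chained so that a failure stops the run early. This is a minimal sketch under that assumption: `run_pipeline` is a hypothetical helper name, not something this repo provides, and the commands inside it are copied from steps a) to e).

```shell
# Hypothetical helper (not part of this repo): runs steps a) to e) in order.
# The && chain stops at the first step that exits with a non-zero status.
run_pipeline() {
  Rscript scripts/load.R --data_url=https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/listings-Barcelona.csv &&
  Rscript scripts/process.R --path_raw=data/raw_listings.csv --path_clean=data/clean_listings.csv &&
  Rscript scripts/EDA.R --path_clean=data/clean_listings.csv --path_image=images/ &&
  Rscript scripts/lm.R --path_data=data/clean_listings.csv &&
  Rscript scripts/knit.R --final_report="docs/final_report.Rmd"
}
```

Define the function in a shell opened at the main repo directory, then call `run_pipeline`. If a step fails, the later steps are skipped, so the error message from the failing script is the last thing printed.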
Option 2: Run make in a terminal from the main repo directory to run all of the individual scripts above:
a) Dependency
Ensure make is installed
b) Run all scripts and reproduce analysis
make all
c) Delete all output from scripts
make clean
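The three steps of Option 2 can be combined into a single clean rebuild. The sketch below is illustrative only: `rebuild_all` is a hypothetical helper name, and it assumes the repo's Makefile defines the `all` and `clean` targets used above.

```shell
# Hypothetical helper (not part of this repo): checks that make is available,
# deletes existing output, then reruns the whole analysis.
rebuild_all() {
  if ! command -v make >/dev/null 2>&1; then
    echo "make is not installed or not on PATH" >&2
    return 1
  fi
  make clean && make all
}
```

Running `rebuild_all` from the main repo directory removes output from a previous run before regenerating it, which helps confirm that the analysis reproduces from scratch.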
This app will display a map of Barcelona with a point at the location of each listing, colour-coded by price. The exact price, longitude, and latitude will be displayed when hovering over a point. Two violin plots, price vs. district and price vs. room type, will also be displayed. The price range plotted on the map and the two violin plots can be adjusted using a two-way slider. Specific districts (i.e. areas of Barcelona) and room types can be filtered using two dropdown lists, which replot the map and the two violin plots.
Asuna is looking to purchase a property in Barcelona to rent out on Airbnb. She wants to know what type of property to purchase to make the most money, so she looks for a dataset that identifies the most profitable locations and room types. While surfing the internet, Asuna finds the "Barcelona Airbnb Price App." With it, she is able to see the most common prices for the various districts and room types in Barcelona. She reasons that the most common prices will most likely have good booking rates, so to estimate her earnings she picks a higher-priced district and room type and chooses a price slightly above the mean for that combination. Using this information, she can now make an informed decision on what type of property to purchase and what her expected net earnings are.