Repository for the Kristina Wright and Daniel Hadley group project (up to Milestone 03) for STAT 547M. For Milestone 04 through Milestone 06, this repository holds Daniel Hadley's project.
This repository houses Group 09's project for STAT 547M, taken in Term 2 of the 2019-2020 academic year.
Our project uses an Airbnb dataset to identify significant factors that explain the listing prices (per night) in Barcelona, Spain.
The final report is created by meeting milestones which are linked below.
As milestones are met, files are placed into the appropriate subfolders.
- `*.Rmd` files used to create reports.
- `R` scripts (`*.r`) that are called when rendering the project.

Milestone | Due Date :date: | Report |
---|---|---|
01 | February 29, 2020 | milestone01 |
02 | March 7, 2020 | milestone02 |
03 | March 14, 2020 | html and pdf |
04 | March 21, 2020 | dashboard app |
05 | March 28, 2020 | dashboard proposal |
06 | April 4, 2020 | deployed app |
Clone this repo.
Ensure the following `R` packages are installed:
- tidyverse
- here
- docopt
- knitr
- DT
- gridExtra
- corrplot
- glue
- scales
- broom
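
One way to check the package list above is sketched below. It is a minimal example in base R, not part of the project's scripts: it only reports which of the listed packages are missing, and the `install.packages()` call is left commented out so nothing is installed unless you choose to run it.

```r
# Packages named in the list above.
required_pkgs <- c(
  "tidyverse", "here", "docopt", "knitr", "DT",
  "gridExtra", "corrplot", "glue", "scales", "broom"
)

# Compare against the packages visible in the current library paths.
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Optional: uncomment to install from your configured CRAN mirror.
  # install.packages(missing_pkgs)
}
```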
Option 1: Run the following scripts (in order) in a terminal from the main repo directory with the specified arguments:
a) Load data
`Rscript scripts/load.R --data_url=https://raw.githubusercontent.com/STAT547-UBC-2019-20/data_sets/master/listings-Barcelona.csv`
b) Clean data
`Rscript scripts/process.R --path_raw=data/raw_listings.csv --path_clean=data/clean_listings.csv`
c) Exploratory data analysis
`Rscript scripts/EDA.R --path_clean=data/clean_listings.csv --path_image=images/`
d) Linear regression
`Rscript scripts/lm.R --path_data=data/clean_listings.csv`
e) Knit final report
`Rscript scripts/knit.R --final_report="docs/final_report.Rmd"`
Option 2: Run `make` in a terminal from the main repo directory to run all of the individual scripts above:
a) Dependency
Ensure `make` is installed.
b) Run all scripts and reproduce the analysis
`make all`
c) Delete all output from the scripts
`make clean`
This app will have features for data exploration and data analysis. Data exploration shows a density plot of Airbnb listing prices for Barcelona, Spain, and lets the user select a categorical variable to view as a violin plot. The default density is for all listings. Grouping options include District, Room Type, and Minimum Night Stay.
For data analysis, users can filter the dataset by price, distance from city center, or minimum night stay. Then, a simple linear regression is run with price as the dependent variable and an independent variable (the x-axis variable) chosen by the user. Data transformations (logarithm, square root, or reciprocal) are available for either variable, which may improve the linear relationship. Finally, a scatterplot with the linear regression line is presented.
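
The analysis flow described above (filter, transform, fit, plot) can be sketched in base R as follows. This is an illustration only, not the app's code: the data are simulated, and the column names (`price`, `distance`, `min_nights`), the filter thresholds, and the chosen transformations are all assumptions made for the example.

```r
# Hypothetical stand-in data: simulated values, not the real Airbnb listings.
set.seed(547)
n <- 300
listings <- data.frame(
  distance   = runif(n, min = 0.1, max = 10),   # assumed: km from city center
  min_nights = sample(1:30, n, replace = TRUE)  # assumed: minimum night stay
)
listings$price <- exp(4.8 - 0.08 * listings$distance + rnorm(n, sd = 0.4))

# Step 1: filter the dataset (thresholds are arbitrary examples).
filtered <- subset(listings, price <= 400 & distance <= 8 & min_nights <= 14)

# Step 2: the transformations a user could choose for either variable.
transforms <- list(
  none       = identity,
  log        = log,
  sqrt       = sqrt,
  reciprocal = function(v) 1 / v
)
x_transform <- "none"  # example choice for the x-axis variable
y_transform <- "log"   # example choice for price

# Step 3: simple linear regression of (transformed) price on (transformed) x.
model_data <- data.frame(
  x = transforms[[x_transform]](filtered$distance),
  y = transforms[[y_transform]](filtered$price)
)
fit <- lm(y ~ x, data = model_data)
print(coef(fit))

# Step 4: scatterplot with the fitted regression line.
plot(model_data$x, model_data$y,
     xlab = paste0("distance (", x_transform, ")"),
     ylab = paste0("price (", y_transform, ")"))
abline(fit, col = "red")
```

Storing the transformations in a named list keeps the regression call unchanged whichever option is selected, which is one simple way to wire a dropdown choice to the model.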
Maria lives in Barcelona and wants to rent out her apartment while she travels for two weeks. She decides to use Airbnb.com, and her friend has offered to manage the rentals while she is away. Maria wants a listing price that is competitive, but not too low. She knows her apartment is in a very desirable neighborhood for tourists. To guide her price selection, she wants to compare her apartment to similar Airbnb listings. To maximize occupancy, she wants to write a description emphasizing characteristics strongly related to higher listing prices. Maria visits our dashboard and uses the data exploration section to see the distribution of listing prices for her neighborhood. To see the average listing price for comparable apartments, she uses the data analysis section to first eliminate non-comparable listings and outliers. Then, she selects an x-axis variable and uses her apartment's characteristics to see the mean listing price for similar listings. She decides to charge 10% over the mean. By repeating this process for a few different independent variables, she arrives at a reasonable price and knows which characteristics to stress in her listing.
There are two ways to view the dashboard.
Run the dashboard locally by taking the following steps:
1. Clone this repository locally.
2. From the main repo directory in a terminal, run the dashboard by entering the command: `Rscript app.R`
3. To view the dashboard:
   - Copy the address of the form `127.0.0.1:8050` that appears after `start:` in the terminal.
   - Paste the address into the address bar of a web browser such as Chrome or Firefox.