UBC-MDS / 522-Group_30-Rockstars

DSCI 522 Data Science Workflows team project
MIT License

Milestone #2 meeting notes #18

calsvein closed 3 years ago

calsvein commented 3 years ago

- Script 2. Cal. Cal will prioritize this, as Scripts 3 and 4 depend on Script 2. Cal is to drop some columns and do a couple of column mutations.

- Script 3. William. The script will redo the EDA from Milestone 1 and save a number of .png files of tables/graphs. This includes: train_test_split (use random_state = 1234), the violin plot, the correlation heat map, a histogram of the target variable, a scatterplot of the target variable against square footage (try to turn it into a 2D histogram if possible), and the .info table (to show missing values).

- Script 4. Jordan. This script will: run train_test_split (use random_state = 1234), define the feature types, feature transformers, and column transformer pipeline, and preprocess X_train. Get the R^2 scores using linear regression and RidgeCV (if time permits; Tiffany suggested this could come in a future milestone). Then get the model fit results on the test data. Produce a table of model feature names, coefficient weights, and p-values (if possible).

- Script 5. Daniel is the lead; Cal will give secondary support. We need 1-2 pages of narrative write-up. William and Jordan should write the sections that are based on their work. The remaining work will be written up by Daniel/Cal.

Next meeting: Friday at 10:30 am Vancouver time