We talked about how Script 1 works and made a draft for Script 2.
We agreed to divide the work as follows:
-Script 2. Cal. Cal will prioritize this, as Scripts 3 and 4 depend on Script 2. Cal is to drop some columns and do a couple of column mutations.
-Script 3. William. The script will do the EDA from Milestone 1 and save a number of .png files of tables/graphs. This includes: train_test_split (use random_state = 1234), the violin plot, the correlation heat map, a histogram of the target variable, a scatterplot of the target variable against square footage (try to turn it into a 2D histogram if possible), and the .info table (to demonstrate missing values).
-Script 4. Jordan. This script will involve: train_test_split (use random_state = 1234), creating feature types, feature transformers, a column transformer pipeline, and preprocessing on X_train. Get the R^2 scores using linear regression and RidgeCV (if time permits... Tiffany suggested this could come in a future milestone). Then get the model fit results on the test data. Get a table of the model's feature names, coefficient weights, and p-values (if possible).
-Script 5. Daniel is the lead; Cal will give secondary support. We need 1-2 pages of narrative write-up. There are sections in here that William and Jordan should write based on their work. The remaining work will be written up by Daniel/Cal.
Next meeting: Friday at 10:30 am Vancouver time.