Open nooreendabbish opened 8 years ago
For coding/recoding predictors, see lines 15 to 33 of drowsy.models.R in the Patrick folder. Note that I ultimately used SUMMER instead of SEASON. I think we can do better than either of them:

- Spring and summer were both significant, not just summer. Perhaps we should create a two-level variable, where one level is spring+summer and the other is fall+winter.
- We can probably do better than using a cutpoint at the month level. Seasons change in the MIDDLE of a month, and we have a day variable, so we can dig deeper. I think we should take an exploratory step of plotting the frequency of drowsiness through the year at the day level (perhaps binning by week?). This might also reveal a good way to create a time-of-year variable with more than 2 levels/periods (ideally showing significance for each level/period?).
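A rough sketch of both ideas in base R, on made-up data. The column names `MONTH`, `DAY`, and `DROWSY`, the month cutpoints (March to August as spring+summer), and the reference year are all assumptions for illustration, not what drowsy.models.R actually uses. The weekly proportions here are unweighted, so they are only good for a first look.

```r
# Hypothetical toy data; real column names and coding may differ.
crashes <- data.frame(
  MONTH  = c(1L, 4L, 7L, 10L, 3L, 6L, 9L, 12L),
  DAY    = c(15L, 2L, 20L, 31L, 25L, 21L, 22L, 5L),
  DROWSY = c(0, 1, 1, 0, 1, 1, 0, 0)
)

# Two-level variable: spring+summer vs fall+winter (assumed month-level cutpoints).
crashes$WARM <- factor(
  ifelse(crashes$MONTH %in% 3:8, "spring+summer", "fall+winter")
)

# Day-level position in the year, using an arbitrary non-leap reference year.
dates <- as.Date(sprintf("2015-%02d-%02d", crashes$MONTH, crashes$DAY))
crashes$DOY  <- as.integer(format(dates, "%j"))   # day of year, 1..365
crashes$WEEK <- (crashes$DOY - 1L) %/% 7L + 1L    # week bin, 1..53

# Unweighted proportion drowsy per week bin.
weekly <- aggregate(DROWSY ~ WEEK, data = crashes, FUN = mean)

if (interactive()) {
  plot(weekly$WEEK, weekly$DROWSY, type = "b",
       xlab = "Week of year", ylab = "Proportion drowsy")
}
```

Once the plot suggests natural breakpoints, `cut(crashes$DOY, breaks = ...)` would be one way to turn them into a multi-level time-of-year factor.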
For model building/selection, use fwd.stepwise.AIC.R in the Patrick folder. Note how I define the response string and predictor vector up top. We want to load a dataset that subsets by night, instead of the one I listed up top. Then we need to create a new svydesign object. I'm a bit confused about whether we have to create a new svydesign object each time we subset. If we can just create the svydesign object for the full data set and use it for any subset (AND when we add on new calculated columns), that would be ideal.
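On the svydesign question: as far as I understand the survey package, the usual approach is to build the design once on the full data set, then use `subset()` on the design object for subpopulations and `update()` to add calculated columns. A minimal sketch using the `api` example data that ships with survey (not our data; `api_gain` is a made-up column for illustration):

```r
library(survey)
data(api)  # example data from the survey package, stands in for our data set

# Build the design once, on the full data.
full_design <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Add a calculated column to the existing design object.
full_design <- update(full_design, api_gain = api00 - api99)

# Subset the design object (not the raw data frame) for a subpopulation.
elem_design <- subset(full_design, stype == "E")

est <- svymean(~api_gain, elem_design)
print(est)
```

For us, the nighttime analysis would be the analogue of `stype == "E"`, assuming the night indicator is a column in the full data set.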
I have not yet made the change to the fwd.stepwise.AIC function that we discussed in seminar yesterday. Give it a go if you have time (or I can do it perhaps tomorrow). If you can't get the function to return, then just use the "guts" of the function after appropriately renaming your objects.
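For reference, here is a minimal sketch of what a forward stepwise AIC function that returns its result could look like. This is not the code in fwd.stepwise.AIC.R: the function name, arguments, and return structure are hypothetical, and it uses plain `glm()` on `mtcars` rather than `svyglm()` on a design object.

```r
# Hypothetical stand-in for fwd.stepwise.AIC; uses glm(), not svyglm().
fwd_stepwise_aic <- function(response, predictors, data, family = gaussian()) {
  selected <- character(0)
  # Start from the intercept-only model.
  best_fit <- glm(reformulate("1", response = response),
                  data = data, family = family)
  best_aic <- AIC(best_fit)

  repeat {
    candidates <- setdiff(predictors, selected)
    if (length(candidates) == 0) break

    # Fit one model per remaining candidate, each added to the current set.
    fits <- lapply(candidates, function(p) {
      glm(reformulate(c(selected, p), response = response),
          data = data, family = family)
    })
    aics <- vapply(fits, AIC, numeric(1))

    # Stop when no candidate lowers the AIC.
    if (min(aics) >= best_aic) break

    i <- which.min(aics)
    selected <- c(selected, candidates[i])
    best_fit <- fits[[i]]
    best_aic <- aics[i]
  }

  # Return everything the caller is likely to need.
  list(selected = selected, aic = best_aic, fit = best_fit)
}

# Response string and predictor vector defined up top, as in the original script.
response   <- "mpg"
predictors <- c("wt", "hp", "qsec", "drat")
result <- fwd_stepwise_aic(response, predictors, mtcars)
print(result$selected)
```

For a binary response like ours, `family = binomial()` would be the natural choice; the Gaussian example is only there to keep the sketch free of convergence warnings.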
I'll look through the code now, thanks for the pointers.
I wonder if we should open season as an issue also, lol.
The only way I would know to test if you need to re-create the survey design object would be to change subset and test, then re-create and see if the results are the same.
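A sketch of that comparison, again on the survey package's `api` example data rather than ours. The point estimates should agree between the two approaches; the standard errors are the thing to look at, since they can differ when the design is rebuilt on already-subsetted data.

```r
library(survey)
data(api)  # example data from the survey package

# Approach 1: design on the full data, then subset the design object.
full_design <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)
by_subset <- svymean(~api00, subset(full_design, stype == "H"))

# Approach 2: subset the data first, then re-create the design.
high_only <- apiclus1[apiclus1$stype == "H", ]
rebuilt_design <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = high_only)
rebuilt <- svymean(~api00, rebuilt_design)

# Compare point estimates and standard errors side by side.
print(c(subset_design = unname(coef(by_subset)), rebuilt = unname(coef(rebuilt))))
print(c(subset_design = unname(SE(by_subset)),   rebuilt = unname(SE(rebuilt))))
```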
I am interested in repeating some of Patrick's analysis on nighttime drivers to see if the same factors are important.