Closed: jbourak closed this issue 3 years ago
Per this feedback, I think the problem came from a disconnect between the report and the analysis. I came up with the idea while writing the report and mentioned overfitting without verifying whether our workflow actually addressed it.
However, this is a good point and could help reduce our overfitting. Ella raised the same idea in her peer review of our project. We could simply pass a "stop_words" argument to CountVectorizer to address this issue.
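For reference, here is a minimal sketch of what that change might look like, assuming we use scikit-learn's CountVectorizer with its built-in English stop word list. The sample documents and variable names are made up for illustration and are not taken from our actual pipeline.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample documents, not our project data.
docs = [
    "the movie was great and the cast was great",
    "the plot was dull and the ending was worse",
]

# Baseline: no stop word filtering, so common words become features.
baseline = CountVectorizer()
baseline.fit(docs)

# Proposed change: drop scikit-learn's built-in English stop words.
filtered = CountVectorizer(stop_words="english")
filtered.fit(docs)

# Words that the stop word list removed from the vocabulary.
removed = sorted(set(baseline.vocabulary_) - set(filtered.vocabulary_))
print("Removed by stop word filtering:", removed)
print("Remaining features:", sorted(filtered.vocabulary_))
```

The built-in list is generic, so if the words we are worried about are specific to our data, we could pass our own list of words to "stop_words" instead of "english".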
@Andrew-Tan @yzr1996 @arashshams
We need to fix the hyperlink in the final report near:
For more details on the model selection process, see the model comparison report.
You mention in part III, 2. that "we might want to avoid overfitting to these words as we train our model", but I cannot see where or whether you have done this in your analysis. Be explicit about what you will do (or would like to do) about the issue you raised here.