As part of the modeling process, we want to explore what possibilities we've got with Naïve Bayes models, as well as RF. These are both on your desk Kelsey :)
Here's what I've noticed so far in the baseline. NB doesn't really have many params, at least not the MultinomialNB I used. I'm not super confident that's the NB variant we need (it assumes count-style features), but I think it's right. The Laplace smoothing param (alpha) just adds pseudocounts so rare features don't get zero probability - I'm not really sure how much tuning it will buy us.
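If you want to poke at the smoothing param anyway, here's a minimal sketch of sweeping alpha with cross-validation. The data here is synthetic (nonnegative counts, which is what MultinomialNB expects) - swap in our real feature matrix.

```python
# Sketch: sweep MultinomialNB's Laplace smoothing (alpha) on toy count data.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 30))  # synthetic nonnegative counts
y = rng.integers(0, 2, size=200)        # synthetic binary labels

for alpha in [0.01, 0.1, 1.0, 10.0]:
    score = cross_val_score(MultinomialNB(alpha=alpha), X, y, cv=5).mean()
    print(f"alpha={alpha}: mean CV accuracy {score:.3f}")
```

On random labels like these the scores will all hover around chance; the point is just the loop shape.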
As for RF, it was taking ~3GB of RAM on my desktop. The more features you let it consider per split (via max_features), the longer it takes to run. max_depth also affects runtime and memory a lot, of course. I didn't really look into any of the other params - maybe setting a minimum leaf size (min_samples_leaf) will make it go faster, since larger leaves mean shallower trees?
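For reference, here's a small random-forest sketch showing the knobs above, again on synthetic data; the specific values are placeholders, not recommendations.

```python
# Sketch: the RF params mentioned above, on toy data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic target

rf = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",   # features tried per split; fewer -> faster splits
    max_depth=10,          # caps tree depth, which bounds memory and fit time
    min_samples_leaf=5,    # minimum leaf size; larger -> smaller, faster trees
    n_jobs=-1,             # use all cores
    random_state=0,
)
rf.fit(X, y)
print("train accuracy:", rf.score(X, y))
```

Cranking min_samples_leaf up (or max_depth down) is probably the cheapest way to shrink that 3GB footprint.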
Neither algorithm performed especially well on the dev set for me. Looking forward to you proving me wrong haha.