Closed: PiotMik closed this issue 2 years ago
@AniaMatysek - as you beat us to the validation part, could you post some links to resources/techniques you used, so that we can be consistent across models?
@PiotMik I've used the Monte Carlo Cross-Validation method, as suggested during the lectures and in Phillipe's example project. You can read a little bit about the idea here: https://towardsdatascience.com/cross-validation-k-fold-vs-monte-carlo-e54df2fc179b. Then just follow Phillipe's instructions. He also mentions the K-fold Cross-Validation method, but he states that Monte Carlo is better since it lets us visualize more. If you accept using this method, please close this issue.
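In case it helps, here is a minimal sketch of the idea (repeated random train/test splits), written with Python/scikit-learn purely for illustration; the dataset, model and split settings below are placeholders, not our actual project code:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Placeholder data and model -- swap in the project's actual dataset/model.
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
model = LinearRegression()

# Monte Carlo CV = repeated random train/test splits (ShuffleSplit here).
# Each split draws a fresh random 80/20 partition, so n_splits controls
# how many models get fitted and scored.
mc_cv = ShuffleSplit(n_splits=100, test_size=0.2, random_state=42)
scores = cross_val_score(model, X, y, cv=mc_cv, scoring="r2")

print(f"Monte Carlo CV: mean R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```

The more splits you run, the more scores you get to plot as a full distribution (rather than just k values from K-fold), which is where the "visualize more" point comes from.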
Hey @AniaMatysek, apologies, I missed this question.
No issues with Monte Carlo CV, other than its computational cost. K-fold puts less computational strain on the program, so it's way faster to run.
I noticed that the pdf has been compiling for a very long time ever since we added the Validation chapters, but I suggest we simply tune down the Monte Carlo nRuns parameter while developing the project (I set it to 10 just to have it run fairly quickly), and then bring it back to normal when compiling the final pdf.
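To make that concrete, here is a rough sketch of the kind of switch I mean, again in Python/scikit-learn for illustration only; `DEV_MODE` and `N_RUNS` are made-up names standing in for however the actual nRuns parameter is wired up in our code:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score

# Hypothetical switch: few runs while developing, full budget for the final pdf.
DEV_MODE = True
N_RUNS = 10 if DEV_MODE else 1000  # number of Monte Carlo repetitions

# Placeholder data and model.
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
model = LinearRegression()

# Monte Carlo CV: cost grows linearly with the number of runs.
mc_cv = ShuffleSplit(n_splits=N_RUNS, test_size=0.2, random_state=42)
mc_scores = cross_val_score(model, X, y, cv=mc_cv)

# K-fold CV: always exactly k fits, hence the lighter option.
kf_cv = KFold(n_splits=5, shuffle=True, random_state=42)
kf_scores = cross_val_score(model, X, y, cv=kf_cv)

print(f"Monte Carlo ({N_RUNS} runs): {mc_scores.mean():.3f}")
print(f"K-fold (5 folds):            {kf_scores.mean():.3f}")
```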
Other than that I'm happy with the choice - closing the issue, especially since you and @joannakrezel already did a fair amount of work there.
Read about model validation techniques and prepare a method (or methods) of choice for the problem. The techniques chosen here will be used during model validation.
Some references to get you started: