Closed parismita closed 5 years ago
should I set the seed in the function itself or in the examples of the doc?
in the examples
usually it is bad practice to provide a function which sets the seed
On Wed, Jun 5, 2019 at 12:47 AM parismita notifications@github.com wrote:
should I set the seed in the function itself or in the examples of the doc?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aldro61/mmit/pull/35?email_source=notifications&email_token=AAHDX4XFXLJ4LKR5MP4B75DPY5TUZA5CNFSM4HE4P5W2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODW63GUY#issuecomment-498971475, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHDX4WGU7VOSPC6NNMCRSDPY5TUZANCNFSM4HE4P5WQ .
i'll update the tutorials.rmd to include more details....currently it only has the user guidelines
should I include details about the splitting algorithm, minimum cost complexity pruning and algorithm of random forest etc?
the more details that you can include (somewhere) the better
maybe it is more appropriate to include some details on the help pages, and then reference those pages from the vignette. your call.
@tdhock reference as in? should I add the details in vignettes and then give its reference in help page?
why is it a problem if "the tree will vary each run" ?
is there some other way of coding these functions, so that the seed can be set in the example? That really is standard (and what we should do)
On Wed, Jul 3, 2019 at 10:54 AM parismita notifications@github.com wrote:
@parismita commented on this pull request.
In Rpackage/R/fitAndScore.R https://github.com/aldro61/mmit/pull/35#discussion_r300085828:
@@ -1,6 +1,6 @@ .fit_and_score <- structure(function(target.mat, feature.mat, parameters, n_folds = 3, scorer = NULL,
- learner = NULL, pruning = TRUE){
- learner = NULL, pruning = TRUE, seed = NULL){
the random function i used multiple times in fitAndScore function, we need to set the seed multiple times....hence if we do set.seed in example....it will set seed for the topmost sample function...hence ultimately the tree will vary with each run
Also fitAndScore is a helper function.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aldro61/mmit/pull/35?email_source=notifications&email_token=AAHDX4T232MDANGME63BLOLP5TRTPA5CNFSM4HE4P5W2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOB5NWUBQ#discussion_r300085828, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHDX4TSYX32DAE7AISO2TTP5TRTPANCNFSM4HE4P5WQ .
i'll check it once. Might be I misunderstood the problem, as this is a very common practice to set seed of random forest in examples...
@tdhock The only problem coming is in crossvalidation where we used future_lapply. I am unable to set the seed, even after using L'Ecuyer's RNGStream. future has argument as future.seed...maybe we'll have to use that.
ok
@tdhock Should I delete the examples that were supposed to be converted into Rd by inlinecodes as I have included examples in the Roxygen code.
Also, should we completely remove all the examples or keep the ones used in helper functions?