aldro61 / mmit

Regression trees for interval censored output data
https://aldro61.github.io/mmit/
GNU General Public License v3.0

Adding Tutorials as Vignettes #35

Closed parismita closed 5 years ago

parismita commented 5 years ago

@tdhock Should I delete the examples that were supposed to be converted into Rd by inlinedocs, now that I have included examples in the Roxygen comments?

Also, should we completely remove all the examples or keep the ones used in helper functions?

parismita commented 5 years ago

Should I set the seed in the function itself or in the examples of the doc?

tdhock commented 5 years ago

in the examples

usually it is bad practice to provide a function which sets the seed

On Wed, Jun 5, 2019 at 12:47 AM parismita notifications@github.com wrote:

should I set the seed in the function itself or in the examples of the doc?

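A minimal sketch of the advice above, for illustration only. The helper `random_folds` is hypothetical and not part of mmit; the point is just that the function leaves the random number generator alone, while the documentation example calls `set.seed`.

```r
# Hypothetical helper (not part of mmit): randomly assigns n rows to folds.
# It draws from the RNG but never calls set.seed() itself, so the caller
# stays in control of the random state.
random_folds <- function(n, n_folds = 3) {
  sample(rep(seq_len(n_folds), length.out = n))
}

# In the documentation example, the caller sets the seed.
set.seed(1)
folds_a <- random_folds(10)

set.seed(1)
folds_b <- random_folds(10)

# Same seed, same fold assignment.
identical(folds_a, folds_b)
```

Keeping `set.seed` out of the function means a user who runs the function repeatedly still gets fresh random draws unless they ask for reproducibility themselves.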

parismita commented 5 years ago

I'll update tutorials.rmd to include more details. Currently it only has the user guidelines.

parismita commented 5 years ago

Should I include details about the splitting algorithm, minimum cost-complexity pruning, the random forest algorithm, etc.?

tdhock commented 5 years ago

the more details that you can include (somewhere) the better

maybe it is more appropriate to include some details on the help pages, and then reference those pages from the vignette. your call.

parismita commented 5 years ago

@tdhock Reference as in? Should I add the details in the vignettes and then reference them from the help page?

tdhock commented 5 years ago

why is it a problem if "the tree will vary each run" ?

is there some other way of coding these functions, so that the seed can be set in the example? That really is standard (and what we should do)

On Wed, Jul 3, 2019 at 10:54 AM parismita notifications@github.com wrote:

@parismita commented on this pull request.

In Rpackage/R/fitAndScore.R https://github.com/aldro61/mmit/pull/35#discussion_r300085828:

@@ -1,6 +1,6 @@
 .fit_and_score <- structure(function(target.mat, feature.mat, parameters, n_folds = 3, scorer = NULL,
-  learner = NULL, pruning = TRUE){
+  learner = NULL, pruning = TRUE, seed = NULL){

The random function is used multiple times in the fitAndScore function, so we need to set the seed multiple times. Hence if we call set.seed in the example, it will set the seed only for the topmost sample call, and ultimately the tree will vary with each run.

Also fitAndScore is a helper function.

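As a rough illustration of the question raised here, the sketch below uses a hypothetical function `two_draws` (not the actual `.fit_and_score`) that calls `sample` more than once. In sequential R code, a single `set.seed` call before the function fixes the whole sequence of draws, not only the first one.

```r
# Hypothetical stand-in for a helper that uses the RNG several times.
two_draws <- function(n) {
  first <- sample(n)             # first use of the RNG
  second <- sample(n, size = 2)  # second use continues from the same RNG state
  list(first = first, second = second)
}

set.seed(42)
run_1 <- two_draws(10)

set.seed(42)
run_2 <- two_draws(10)

# Both the first and the second draw repeat across runs.
identical(run_1, run_2)
```

This holds only while all draws happen in the same R process; code that hands work to parallel workers needs separate handling.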

parismita commented 5 years ago

I'll check it once. Maybe I misunderstood the problem, as it is very common practice to set the seed for a random forest in examples.

parismita commented 5 years ago

@tdhock The only remaining problem is in cross-validation, where we use future_lapply. I am unable to set the seed there, even after using L'Ecuyer's RNGStream. future_lapply has a future.seed argument; maybe we'll have to use that.
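A small sketch of how the `future.seed` argument of `future.apply::future_lapply` might be used, assuming the future.apply package is installed. The function `score_fold` is a hypothetical placeholder for a per-fold computation, not mmit code.

```r
library(future.apply)  # assumption: the future.apply package is installed

# Hypothetical per-fold computation that uses random numbers.
score_fold <- function(fold) {
  mean(rnorm(5))
}

# future.seed accepts TRUE or an integer seed. With an integer, each element
# gets its own L'Ecuyer-CMRG random stream derived from that fixed seed.
scores_1 <- future_lapply(1:3, score_fold, future.seed = 123L)
scores_2 <- future_lapply(1:3, score_fold, future.seed = 123L)

# The two runs produce the same per-fold values.
identical(scores_1, scores_2)
```

With `future.seed = TRUE` instead of an integer, the streams are derived from the current RNG state, so calling `set.seed` in the example beforehand should also give repeatable results.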

parismita commented 5 years ago

ok