Closed gdalle closed 1 year ago
Thanks for your suggestions and doing the JOSS review. I'll use these comments to document the responses to the issues that you raised.
In addition to a link, it would be nice to provide quantitative arguments for the claims found eg. in the docs:
Yes you're right about the use of "this implementation is fast" in the documentation. Apart from optimizing a bit (e.g., https://github.com/rikhuijzer/SIRUS.jl/commit/7ef3cb59ad9c7a14c97234f91933dbaf883b7fbb, https://github.com/rikhuijzer/SIRUS.jl/commit/60b56b2ce52a4816d8aeb209b886c822ade99cd7, https://github.com/rikhuijzer/SIRUS.jl/commit/5797c8aa6bacb0e8951b1d467a1c3b5ac8376ba2, https://github.com/rikhuijzer/SIRUS.jl/commit/82c90153cb484f4b45d19275b57f37a0555b8f8c, and https://github.com/rikhuijzer/SIRUS.jl/commit/141d61b4b4ef2e3528381a8d0381d32cc474712c), I haven't spent much time on it so it's best to not make any claims there. I've removed the sentence from the docs (https://github.com/rikhuijzer/SIRUS.jl/commit/d502a4e15a832920e1135d9dcf3c70c5139b1e6d).
I've added quantitative arguments in the form of lines of code (5f6bbd142fcd31c4aa9da871104a775e304ef0c5). It's a really nice idea to add those metrics since it clearly shows the difference between C++ and R (10k + 2k = 12k lines of code) versus Julia (2k lines of code). I've obtained these numbers via
~/git/sirus (master) > wc -l src/*
6 src/AAA_check_cpp11.cpp
275 src/Data.cpp
235 src/Data.h
48 src/DataChar.cpp
71 src/DataChar.h
69 src/DataDouble.h
36 src/DataFloat.cpp
65 src/DataFloat.h
84 src/DataRcpp.h
43 src/DataSparse.cpp
75 src/DataSparse.h
1035 src/Forest.cpp
273 src/Forest.h
334 src/ForestClassification.cpp
77 src/ForestClassification.h
342 src/ForestProbability.cpp
77 src/ForestProbability.h
255 src/ForestRegression.cpp
56 src/ForestRegression.h
362 src/ForestSurvival.cpp
71 src/ForestSurvival.h
106 src/globals.h
4 src/Makevars
4 src/Makevars.win
399 src/rangerCpp.cpp
118 src/RcppExports.cpp
596 src/Tree.cpp
191 src/Tree.h
762 src/TreeClassification.cpp
111 src/TreeClassification.h
766 src/TreeProbability.cpp
120 src/TreeProbability.h
723 src/TreeRegression.cpp
96 src/TreeRegression.h
910 src/TreeSurvival.cpp
117 src/TreeSurvival.h
589 src/utility.cpp
537 src/utility.h
39 src/utilityRcpp.cpp
10077 total
~/git/sirus (master)> wc -l R/*
46 R/formula.R
708 R/ranger.R
19 R/RcppExports.R
856 R/sirus.R
550 R/sirus_utility.R
74 R/utility.R
2253 total
~/git/SIRUS.jl (main) > wc -l src/*
90 src/classification.jl
268 src/dependent.jl
62 src/empiricalquantiles.jl
385 src/forest.jl
27 src/helpers.jl
403 src/mlj.jl
59 src/regression.jl
408 src/rules.jl
87 src/ruleshow.jl
47 src/SIRUS.jl
97 src/weights.jl
1933 total
I couldn't find the original implementation linked in any of the papers by the SIRUS authors, did I miss something?
A link has been added to the docs (https://github.com/rikhuijzer/SIRUS.jl/commit/d502a4e15a832920e1135d9dcf3c70c5139b1e6d) and paper (https://github.com/rikhuijzer/SIRUS.jl/pull/36/commits/b60215d7c0fa65f464e8ccffecf014f9a8131064).
very cool!
I couldn't find the original implementation linked in any of the papers by the SIRUS authors, did I miss something?
In addition to a link, it would be nice to provide quantitative arguments for the claims found eg. in the docs:
https://github.com/openjournals/joss-reviews/issues/5786