mhahsler / Introduction_to_Data_Mining_R_Examples

R Code to accompany the book Introduction to Data Mining by Tan, Steinbach and Kumar (Code by Michael Hahsler)
https://mhahsler.github.io/Introduction_to_Data_Mining_R_Examples/
Creative Commons Attribution Share Alike 4.0 International

Review Items #6

Open stats-tgeorge opened 2 months ago

stats-tgeorge commented 2 months ago

@mhahsler

First of all, there is so much content here. Thank you for your work.

Documentation

Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)

JOSE Paper

General Comments (do not have to be addressed)

mhahsler commented 2 months ago

@stats-tgeorge

Dear reviewers and editor,

Thank you for the very useful comments! My responses are typeset as blockquotes.

I was initially under the impression that there was a strict 2-page limit, including the references. However, it looks like the limit is 1000 words, so I was able to provide more information in the paper to address all the comments.

Some general changes are:

Responses to the individual comments:

Documentation

In the "Statement of Need," I now specify, "This resource targets advanced undergraduate and graduate students and can be used as a first introduction to data mining." I believe there are better sources out there for statistics and data science courses.

I have rewritten the "Instructional Design" section to focus more on how an instructor would integrate it into a course.

Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)

I added a new section called "Learning Objectives and Content," which now provides more details on the content.

When I reorganized Chapter 4, I added an explanation of how to read the models' output and a baseline model for the section Model Comparison.

The explanation for the warning was not in the right place. I have moved it to directly after the point where the warning occurs.

This is now resolved.

JOSE Paper

See response above.

I have added a short list in the section "Learning Objectives and Content".

I have added one missing DOI. The textbook Introduction to Data Mining has no DOI.

I have added a "Story of the Project" section to the paper.

General Comments (do not have to be addressed)

I have added across() with an example and an explanation to Chapter 1.
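For readers unfamiliar with across(), here is a minimal sketch of the kind of call meant. It is not the example from Chapter 1; the built-in iris data set and the name iris_means are assumptions chosen only for illustration.

```r
library(dplyr)

# Apply the same summary function to every numeric column at once.
# across() selects the columns, and where(is.numeric) picks them by type.
iris_means <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), mean))

# One row per species, one column per averaged measurement.
print(iris_means)
```

Without across(), the same result would need one mean() call per column inside summarise().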

Thank you for bringing this very well-made dataset to my attention. I now use it throughout the resource for exercises (see the Exercises sections at the end of each chapter).

Good idea. I have renamed the section.

I went through the chapters and added library() calls to the subsections where a package is used. The code in each chapter is now self-contained, and most sections can be run as long as tidyverse and the data set are loaded.

Yes, this ordering is due to the textbook. Changing the order would make it more confusing for students following the textbook.

This is true. tidymodels is on my list to add in the future, and I will probably produce a second version with it at some point.

stats-tgeorge commented 2 months ago

@mhahsler

Looking good! There is an inconsistency in your Statement of Need. It appears on your website, in your repo README, and in your JOSE paper, and the versions do not all match. I believe the JOSE paper version is the most up to date. Can you update the other locations?

mhahsler commented 2 months ago

@stats-tgeorge thank you for catching this. I have updated the statement in the README on GitHub.

https://github.com/mhahsler/Introduction_to_Data_Mining_R_Examples

Is the version on my website the github.io version that is linked from my homepage? It is updated automatically whenever changes are pushed to GitHub.

https://mhahsler.github.io/Introduction_to_Data_Mining_R_Examples/

stats-tgeorge commented 3 weeks ago

@mhahsler in your paper references (bib file), the Kuhn & Max reference is formatted differently from the other entries. Please fix this.

mhahsler commented 3 weeks ago

Interesting. The author field seems to be broken in the caret package.

> citation("caret")
To cite caret in publications use:

  Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software,
  28(5), 1–26. https://doi.org/10.18637/jss.v028.i05

A BibTeX entry for LaTeX users is

  @Article{,
    title = {Building Predictive Models in R Using the caret Package},
    volume = {28},
    url = {https://www.jstatsoft.org/index.php/jss/article/view/v028i05},
    doi = {10.18637/jss.v028.i05},
    number = {5},
    journal = {Journal of Statistical Software},
    author = {{Kuhn} and {Max}},
    year = {2008},
    pages = {1–26},
  }

I have fixed the author field in the BibTeX file and will let Max know.
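For context, BibTeX reads `{{Kuhn} and {Max}}` as two separate authors, "Kuhn" and "Max", which is why the reference rendered as "Kuhn & Max". The sketch below shows one way to produce a well-formed entry from R with bibentry() and person() from the utils package. It is only an illustration, not necessarily how the entry was fixed by hand in the paper's bib file; the names caret_ref and caret_bib are made up for this example.

```r
# bibentry(), person() and toBibtex() come from utils, which R attaches by default.
# person() keeps given and family name apart, so the author is a single person.
caret_ref <- bibentry(
  bibtype = "Article",
  title   = "Building Predictive Models in R Using the caret Package",
  author  = person(given = "Max", family = "Kuhn"),
  journal = "Journal of Statistical Software",
  year    = "2008",
  volume  = "28",
  number  = "5",
  pages   = "1--26",
  doi     = "10.18637/jss.v028.i05"
)

# Render the entry as BibTeX; the author field now holds one name.
caret_bib <- toBibtex(caret_ref)
print(caret_bib)
```

The rendered author line reads `author = {Max Kuhn},` and can be pasted into the bib file in place of the broken one.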