ds4se / chapters

Perspectives on Data Science for Software Engineering

Minku's How You Learn Review #105

Closed nataliajuristo closed 8 years ago

nataliajuristo commented 8 years ago

Review template

Before filling in this review, please read our Advice to Reviewers.

(If you have confidential comments about this chapter, please email them to one of the book editors.)

Title of chapter

How You Learn [Data Models] Does Matter and There Are Many Ways You Can Do This

URL to the chapter

https://github.com/ds4se/chapters/blob/master/minkull/learning-styles.md

Message?

What is the chapter's clear and approachable take away message?

Different types of learning algorithm work for specific situations (problem, data and environment)

Accessible?

Is the chapter written for a generalist audience (no excessive use of technical terminology) with a minimum of diagrams and references? How can it be made more accessible to generalists?

It is accessible to a broad audience. No need of previous knowledge to get the message.

Try to use the complete term when calling things. For instance, do not use the short “algorithms” but the full name “learning algorithms” (or whatever). Similarly with models. Might be others.

Try to be consistent with the terms and avoid synonyms. Continuing the same example, choose one term (learning algorithms, machine learning algorithms, or whatever it is) and use it consistently throughout the chapter.

Size?

Is the chapter the right length? Should anything missing be added? Can anything superfluous be removed (e.g. by deleting some section that does not work so well or by using less jargon, fewer formulae, fewer diagrams, fewer references)? What are the aspects of the chapter that authors SHOULD change?

Right length. Nothing missing. Nothing to remove. I like it as it is.

Gotta Mantra?

We encouraged (but did not require) the chapter title to be a mantra or something cute/catchy, i.e., some slogan reflecting best practice for data science for SE. If you have a suggestion for a better title, please put it here.

Examine your problem first, then choose the type of algorithm to consider.

Again, include the full term here (learning algorithm?). Replace "consider" in your mantra with the proper name of the task (data analysis?).

Regarding the title, I would use something more understandable to a broad audience. Something along the lines of: the learning algorithm you use for data analysis affects the results you get. That way the context is clear from the title.

Best Points

What are the best points of the chapter that the authors should NOT change?

Sections correspond to questions to ask when selecting learning algorithms.

meido commented 8 years ago

Title of chapter

How You Learn [Data Models] Does Matter and There Are Many Ways You Can Do This

URL to the chapter

https://github.com/ds4se/chapters/blob/master/minkull/learning-styles.md

Message?

Several different types of data are discussed and the ML algorithms that would best suit the needs are presented.

Accessible?

Is the chapter written for a generalist audience (no excessive use of technical terminology) with a minimum of diagrams and references? How can it be made more accessible to generalists?

Small issues:

Size?

The size of the chapter is good.

Gotta Mantra?

Looks good to me.

Best Points

The different questions that were asked and answered are the key contributions.

timm commented 8 years ago

I've nothing to add to the above. Well written chapter!

GRuhe commented 8 years ago

Tim,

Thanks for all the recent messages ☺ … any chance of changing the process and directing them ONLY to the authors of the respective article?

G.

From: Tim Menzies [mailto:notifications@github.com] Sent: December-23-15 10:29 AM To: ds4se/chapters chapters@noreply.github.com Subject: Re: [chapters] Minku's How You Learn Review (#105)

I've nothing to add to the above. Well written chapter!


minkull commented 8 years ago

Thanks a lot for the comments. I've just revised the chapter.

Happy New Year!

minkull commented 8 years ago

Response to Natalia's comments:

Try to use the complete term when calling things. For instance, do not use the short “algorithms” but the full name “learning algorithms” (or whatever). Similarly with models. Might be others.

Thanks for pointing that out. I'm now always using "learning algorithm", "data models" and "data analytics".

Examine your problem first, then choose the type of algorithm to consider.

Again, include the full term here (learning algorithm?). Replace "consider" in your mantra with the proper name of the task (data analysis?).

Changed to "examine your data analytics problem first, then choose the type of learning algorithm to consider".

Regarding the title, I would use something more understandable to a broad audience. Something along the lines of: the learning algorithm you use for data analysis affects the results you get. That way the context is clear from the title.

This is a nice title suggestion too. I've added the potential title "The Learning Algorithm You Use for Data Analytics Affects the Results You Get" to the chapter. I'm happy for the editors to use either of these titles.

Response to Mei's comments:

Small issues:

"investigate the problem in hands before" -> "investigate the problem in hand before"

Changed. Thanks for pointing that out!

In the first question, an excellent example of new data is given. It would be very useful to add an example of what researchers have done with this data in the past, or of what the goal of modelling with this data is. The same applies to the next question.

I've added the example of prediction of crash-prone commits for the first question, and software defect prediction and software effort estimation for the second question.

This sentence is too long with many different facets of information. Maybe break it down? - "When temporal information about the data is available, change detection techniques can be used in combination with online or chunk-based algorithms (Gama and Gaber 2007) to identify when a change that affects the adequacy of the current model is occurring (Minku and Yao 2012a), which existing models best represent the current situation (Minku and Yao 2012b) and how to update models to the new situation (Minku and Yao 2012ab, 2014). "

It is indeed very long. I've changed it to:

"When temporal information about the data is available, change detection techniques can be used in combination with online or chunk-based learning algorithms (Gama and Gaber 2007) to handle changes. For instance, they can be used to (1) identify when a change that affects the adequacy of the current data model is occurring (Minku and Yao 2012a), (2) determine which existing data models best represent the current situation (Minku and Yao 2012b) and (3) decide how to update data models to the new situation (Minku and Yao 2012ab, 2014)."

"but a lot of data from other softwares" -> "but a lot of data from other software"

Changed. Thanks for pointing that out!

timm commented 8 years ago

@minkull thanks for the revision. the issue of title remains.

How about: Which data mining method do you need?

leave it to you. But try to make the current one shorter, more to the point. it should jump off the table of contents and make people rush to read your chapter.

minkull commented 8 years ago

@timm Hi Tim, I'm fine with "Which data mining method do you need?"

timm commented 8 years ago

:+1: @minkull Good to go!