JacquesCarette / Drasil

Generate all the things (focusing on research software)
https://jacquescarette.github.io/Drasil
BSD 2-Clause "Simplified" License

Chapter 1 Introduction Ready for Review #3262

Closed tingyuw closed 1 year ago

tingyuw commented 1 year ago

Chapter 1 Introduction is ready for review. Please see the first chapter in Thesis. Please let me know what you think. Thank you!

JacquesCarette commented 1 year ago

Comments:

Structurally, I like how the next parts are Background on Drasil and Jupyter, to make the Problem Statement understandable to the reader.

However, I did not review those parts: the "1.0" part needs so much work that I think it needs to be fixed first, because the other parts, 1.1 and 1.2, need to follow logically from it and make sense with it as a whole. Please fix this, and then ask again for a review.

tingyuw commented 1 year ago

If I keep the same structure, where 1.1 is the background on Drasil and Jupyter and 1.2 is the problem statement, would the first paragraph be a quick summary of both the background information and the problems? You mentioned that the introduction should start with something like what Jupyter notebooks are. Since this information is presented in 1.1, I was wondering whether a summary would make sense, or whether the first paragraph should be something else. Or should the contents be structured differently (e.g., move 1.1 to the first paragraph)?

JacquesCarette commented 1 year ago

1.0 (i.e. what is before 1.1) should provide context and very high level ideas of what's coming. It's not a summary of 1.1 and 1.2, but rather a foreshadowing, to let the reader know they'll need the deeper details provided in 1.1 and 1.2 to really understand what your thesis is about.

smiths commented 1 year ago

@tingyuw we have some documents on writing in the se4sc repo, including a list of previous MEng report samples.

tingyuw commented 1 year ago

Would it make more sense if I start with an introduction to scientific computing, noting that developing high-quality software and documentation requires time and effort, and that Jupyter Notebook is an application for creating scientific computing documentation? I would then cover the problems of writing the documentation manually and how generative programming improves it, and then get into 1.1, the introduction of Drasil.

JacquesCarette commented 1 year ago

It makes sense, but don't make that too long. Get to Jupyter quickly. It does make sense to have a proper description of Jupyter (what it is, who uses it, its growth) to show the relevance of your work.

Yes about problems for writing the documentation manually (with citations). The "generative programming improves it" is a claim that needs to be justified, and not a conclusion / fact. So it's part of the research itself. You can and still should talk about it, but using different words.

tingyuw commented 1 year ago

Do you have any suggested papers to read that talk about the problems of writing documentation manually, or just about the repetitive work involved in developing scientific computing documentation in general? I found some that talk about best practices for developing scientific software, but they do not cover much about why it's time-consuming and inefficient.

JacquesCarette commented 1 year ago

I don't, off the top of my head. I think there are papers by Diane Kelly that @smiths would know better, that do comment on that.

tingyuw commented 1 year ago

I did find some papers by Diane Kelly and also one that talks about the implementation challenges in ML documentation. I guess there are some similarities.

I have the updated version ready in Thesis. Please review chapter one. Thank you. @JacquesCarette @smiths

smiths commented 1 year ago

@tingyuw there are a few papers that discuss the challenges for writing scientific software. There are papers that discuss documentation being difficult, but I can't think of a paper that explicitly says it is repetitive. I'll mention some papers below. They can all be found in our pub repo in the References.bib file. I'll identify them by the same citation key as in the bib file.

  1. WieseEtAl2019, Naming the Pain in Developing Scientific Software: "Moreover, in spite of its importance, we perceived that writing a good Documentation also poses a major issue. For instance, one respondent mentioned that “I work with biologists that have little knowledge of programming. Therefore, the software must be easy to use. Write a clear documentation is always a challenge”. Documentation issues are 8.7% of the total problems raised by the respondents, considering the technical, social and scientific-related problems group."
  2. HeatonAndCarver2015, Claims About the Use of Software Engineering Practices in Science: "Documentation requires a significant investment of work. Not all of the claims about documentation were positive. The effort required to create documentation leads some developers to conclude that scientific software developers should be careful about how much documentation they create [25,59]. Furthermore, if the documentation is done to satisfy an external requirement it may not benefit the project team [25,59]. However, one study did find an alternative to performing a separate documentation task and utilized an automatic documentation generator that creates documentation from comments in the project’s source files [35]."
  3. Nguyen-HoanEtAl2010, A Survey of Scientific Software Development: Figure 14 gives reasons for and against documentation. Against reasons include "Effort not worth it due to small user base."
  4. Segal2007, Some Problems of Professional End User Developers: "Professional end user developers did not voluntarily produce documentation, apart from the occasional user guide, in either our studies or those reported in [2]."
  5. SmithEtAl2022, State of the Practice for Lattice Boltzmann Method Software: "Interviewees stressed the importance of documentation for both users and developers throughout the interviews. They emphasize that a lack of time and funding (P1) has a negative effect on the documentation. Most of the developers are scientific researchers evaluated on the scientific papers that they produce. Writing and updating documentation is something that is done in their free time, if that time arises. Others also mention inadequate research software documentation (Pinto et al., 2018; Wiese et al., 2019). The problem also arises with non-research software (Lethbridge et al., 2003)."
  6. SmithAndKoothoor2016, A Document-Driven Method for Certifying Scientific Computing Software for Use in Nuclear Safety Analysis: This paper shows inconsistencies between documentation and code that are a likely consequence of reverse engineering the documentation at some point in the process, and then not updating the documentation as the development proceeds.
  7. SmithJegatheesanAndKelly2016, Advantages, Disadvantages and Misunderstandings About Document Driven Design for Scientific Software: "Systematic software engineering methods are also often not applied when writing software. In SCS “teams have tended to favor individual team members and good practices over more rigid processes and tools” [5]. This has occurred despite the fact that precise documentation provides multiple benefits, including “easier reuse of old designs, better communication about requirements, more useful design reviews, easier integration of separately written modules, more effective code inspection, more effective testing, and more efficient corrections and improvements” [6]."

tingyuw commented 1 year ago

Thank you @smiths! I'll check them out.

JacquesCarette commented 1 year ago

Comments on latest version of Ch.1:

smiths commented 1 year ago

@tingyuw, what is the status of Chapter 1? Have you incorporated the feedback from the above discussion?

tingyuw commented 1 year ago

@smiths I'm working on it! Will have it done today.

tingyuw commented 1 year ago

@smiths I have addressed the above feedback. The original link to my report should reflect the changes!

smiths commented 1 year ago

@tingyuw here is a review of Chapter 1 via a marked-up pdf file. Given that @JacquesCarette and I have both responded, I'm going to close this issue.

Ch1_Fdbck.pdf

tingyuw commented 1 year ago

Feedback is incorporated in https://github.com/JacquesCarette/Drasil/commit/b4d6ef6fe7a1b04ff6da4ff8379650be6d356772.