betterscientificsoftware / bssw.io

Better Scientific Software Homepage
https://bssw.io

What is Research Software #479

Open markcmiller86 opened 4 years ago

markcmiller86 commented 4 years ago

I think we need a what-is on this topic.

rinkug commented 4 years ago

By definition, research software is software that is used to aid research. I am curious as to why we want a what-is on this...a quick what-is may be easy to write but perhaps superfluous?

markcmiller86 commented 4 years ago

Should we include a sub-section on what an RSE (research software engineer) is?

markcmiller86 commented 4 years ago

Include some details about how research software is different from other software projects our community may be familiar with, maybe even including some leading examples of each.

markcmiller86 commented 4 years ago

Do ASCR and NNSA (two software halves of DOE) have a common understanding of this?

rinkug commented 4 years ago

@bernhold to contact potential authors (based on 4-16-2020 meeting minutes) for this topic.

bernhold commented 4 years ago

Reached out to contact@us-rse.org (Steering Committee) for help with this.

rinkug commented 1 year ago

@bernhold and @markcmiller86 : Is this issue sufficiently addressed in https://bssw.io/items/what-is-a-research-software-engineer ?

If not, what is missing and what else can be done to close this?

We don't really have a topic called "Research Software" under which to write a what-is, although we can write an original article/cc and title it "what is research software" or something similar.

markcmiller86 commented 1 year ago

I am fine closing.

FWIW, I've always been a little perplexed by the "RSE" (mainly the "R") designation. The "SE" I am good with. In fact, I prefer SE over CS. I see myself more as an engineer and not a "scientist". I've always wondered what about the "R" part makes the SE we do any different from any other SE. Now, I think scientific software engineering ("SSE" and the software engineering that is somewhat specific to the needs of scientific software), I can wrap my head around. But, "RSE" just leaves me wondering. I feel like the term "research" is thrown around way too much and generally applied to anything that involves investigating something, whether or not that investigation requires significant subject matter expertise, domain knowledge, etc. in order to even plan a set of tasks that would result in completing work on the thing being investigated.

bernhold commented 1 year ago

@rinkug even if we don't have a topic (I'm not advocating for one), we can have an article entitled "What is Research Software". And I still think we should have such an article.

Above, in https://github.com/betterscientificsoftware/bssw.io/issues/479#issuecomment-1510430920, although he's not talking about "research software" per se, @markcmiller86's confusion about the terminology in this space is why I think we need articles like these.

And to respond more directly to @markcmiller86... Personally, I tend to think of RSE as "the engineering of research software". So the R and S are more tightly bound than the S and the E. In other words, RSEs focus on "research software". I think of this as being synonymous with "scientific software". Other people, more thoughtful than me, try to distinguish those two terms, but that has not yet sunk into my brain. So let's set that aside.

I would say that, to a large extent, software engineering for research software is the same as software engineering for other kinds of software. But, to the extent that research software may be unlike some other classes of software (e.g., commercial or business/enterprise software), there will be some differences in how software engineering is practiced on scientific software vs other types. Just in the same way that safety-critical software is treated differently than non-critical software. I am not aware that there are solid research studies of exactly how research software is different than enterprise software, for example. We point to some obvious things in our BSSw tutorial modules. But the need for more study of this was also something that was recognized in the SSSDU workshop, for which we're still trying to get the report out.

markcmiller86 commented 10 months ago

Capturing a ChatGPT starting point here.

Research software in the High-Performance Computing (HPC) and Computational Science and Engineering (CSE) community refers to software applications, libraries, and tools that are specifically designed and developed to support and advance scientific research in various domains. These software tools are typically used to perform complex simulations, data analysis, and modeling in fields such as physics, chemistry, biology, engineering, climate science, and many others.

Here are some key characteristics and aspects of research software in the HPC/CSE community:

  1. Scientific Computing: Research software is often used for scientific computing tasks, including solving mathematical equations, running simulations, and processing large datasets to gain insights into complex scientific phenomena.

  2. Performance Optimization: HPC research software is optimized for performance, taking advantage of parallel processing, vectorization, and other techniques to make the most efficient use of high-performance computing resources, such as supercomputers and clusters.

  3. Parallelism: Many research software packages are designed to harness parallelism, allowing them to distribute computational tasks across multiple processors or nodes in a high-performance computing environment. This is crucial for handling large-scale simulations and data analysis.

  4. Open Source and Community-Driven: A significant portion of research software in HPC/CSE is open-source, meaning the source code is freely available for inspection, modification, and distribution. Community-driven development encourages collaboration and peer review.

  5. Domain-Specific: Research software is often tailored to specific scientific domains. For example, software used in computational fluid dynamics will be different from that used in quantum chemistry. These tools are specialized to address the unique challenges of their respective fields.

  6. Validation and Verification: Research software must be rigorously validated and verified to ensure the accuracy of results. This involves comparing software-generated results with experimental or theoretical benchmarks.

  7. Reproducibility: Reproducibility is crucial in scientific research. Research software should be designed with this in mind, allowing others in the research community to replicate and validate results using the same software and data.

  8. Data Management: Many research software tools deal with large datasets. Proper data management and data handling capabilities are often integrated into these tools to efficiently process and store data.

  9. Interoperability: In some cases, research software needs to be able to work with other software tools or data formats, ensuring that it can be integrated into larger research workflows.

  10. User Support and Documentation: Effective user support and comprehensive documentation are essential for the successful adoption and use of research software. Users should have access to clear instructions and assistance when needed.

  11. Longevity and Sustainability: Maintaining research software over the long term is crucial for the continuity of scientific research projects. Sustainable funding models and community involvement are essential to keep the software up-to-date and relevant.

In summary, research software in the HPC/CSE community plays a critical role in advancing scientific knowledge by providing the computational tools needed to perform complex simulations and data analysis. These software packages are highly specialized, performance-optimized, and often open source, contributing to the reproducibility and collaboration within the scientific community.

markcmiller86 commented 10 months ago

And another ChatGPT answer, this one on how research software differs from commercial software used by the general population:

Research software in the HPC/CSE community differs from commercial software applications used by the general population in several key ways:

  1. Purpose and Target Audience:

    • Research Software: It is primarily designed for scientific research and computational tasks in specialized domains such as physics, chemistry, biology, engineering, and climate science. It caters to researchers, scientists, and engineers who require high-performance and domain-specific tools.
    • Commercial Software: Commercial software applications target a broad user base, including individuals, businesses, and organizations. They often have general-purpose features and are designed for everyday tasks, such as word processing, spreadsheet management, graphic design, and more.
  2. Domain Specificity:

    • Research Software: Research software is highly specialized and tailored to specific scientific disciplines. It may have complex algorithms and models that cater to the unique requirements of scientific simulations and data analysis in those fields.
    • Commercial Software: Commercial software is typically designed to address general computing needs and is not specialized for any particular scientific or engineering domain.
  3. Performance Optimization:

    • Research Software: Research software often places a strong emphasis on performance optimization, taking advantage of parallel processing and other high-performance computing techniques to handle large-scale simulations and data analysis efficiently.
    • Commercial Software: While commercial software may strive for good performance, it usually does not require the same level of performance optimization as research software.
  4. Open Source vs. Proprietary:

    • Research Software: Many research software packages are open source, meaning their source code is openly available for inspection and modification by the community. This fosters collaboration and transparency.
    • Commercial Software: Commercial software is typically proprietary, and its source code is not available to users. It is sold under licenses, and users have limited control over customization.
  5. Validation and Reproducibility:

    • Research Software: Research software must adhere to rigorous validation and verification processes to ensure the accuracy of scientific results. Reproducibility is a key concern, allowing other researchers to replicate experiments and validate findings.
    • Commercial Software: While commercial software is expected to be reliable, it may not undergo the same level of scientific scrutiny and validation as research software.
  6. Community and Support:

    • Research Software: Research software often benefits from active communities of researchers who contribute to its development, provide support, and offer documentation and best practices.
    • Commercial Software: Commercial software typically relies on customer support services provided by the software company.
  7. Cost and Licensing:

    • Research Software: Many research software tools are freely available or offered at reduced costs to academic institutions. They may have more permissive licensing terms.
    • Commercial Software: Commercial software is sold at market prices and often requires organizations to pay for licenses based on the number of users or the features they require.
  8. Longevity and Sustainability:

    • Research Software: Sustainability can be a concern, and the longevity of research software depends on factors such as funding, community support, and the evolving needs of researchers.
    • Commercial Software: Commercial software is typically maintained and updated by the software company as long as it remains profitable and relevant to its user base.

In summary, research software in the HPC/CSE community is specialized, open source (often), optimized for performance, and focused on scientific research, while commercial software is designed for broader, general-purpose use, often with a proprietary licensing model and a different set of priorities.

markcmiller86 commented 10 months ago

So, both ChatGPT responses above are good but possibly a bit long for our needs here. How about I propose a shortened version of the above for this what-is article?

rinkug commented 2 months ago

@markcmiller86 : From the conversation today, perhaps you can write a blog on this topic? Since we now have many tracks, a blog makes more sense to me.