openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: Spiner: Performance Portable Routines for Generic, Tabulated, Multi-Dimensional Data #4367

Closed editorialbot closed 2 years ago

editorialbot commented 2 years ago

Submitting author: @Yurlungur (Jonah Miller)
Repository: https://github.com/lanl/spiner
Branch with paper.md (empty if default branch): joss-paper
Version: v1.5.1
Editor: @dfm
Reviewers: @lgarrison, @jzrake
Archive: 10.5281/zenodo.6800124

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/361f13746822fde77cc0f6f0b0b20bbc"><img src="https://joss.theoj.org/papers/361f13746822fde77cc0f6f0b0b20bbc/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/361f13746822fde77cc0f6f0b0b20bbc/status.svg)](https://joss.theoj.org/papers/361f13746822fde77cc0f6f0b0b20bbc)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@lgarrison & @jzrake, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @dfm know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Checklists

📝 Checklist for @lgarrison

📝 Checklist for @jzrake

editorialbot commented 2 years ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 years ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.01 s (387.9 files/s, 36361.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Markdown                         2             19              0            161
TeX                              1             15              0            155
YAML                             1              1              4             20
-------------------------------------------------------------------------------
SUM:                             4             35              4            336
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 2 years ago

Wordcount for paper.md is 1217

editorialbot commented 2 years ago

Failed to discover a valid open source license

editorialbot commented 2 years ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

lgarrison commented 2 years ago

Review checklist for @lgarrison

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

dfm commented 2 years ago

@Yurlungur, @lgarrison, @jzrake — This is the review thread for the paper. All of our communications will happen here from now on. Thanks again for agreeing to participate!

Please read the "Reviewer instructions & questions" in the first comment above, and generate your checklists by commenting @editorialbot generate my checklist on this issue ASAP. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#4367 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for the review process to be completed within about 4-6 weeks but please try to make a start ahead of this as JOSS reviews are by their nature iterative and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

dfm commented 2 years ago

> Failed to discover a valid open source license

Don't worry about this comment from the bot - it expects the code and paper to both be on the same branch, but this is not the case here, and not a problem!

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.3847/1538-4365/ab007f is OK
- 10.3847/1538-4365/ab09fc is OK
- 10.1088/0264-9381/27/11/114103 is OK
- 10.3847/0004-637X/816/1/44 is OK
- 10.1109/TPDS.2021.3097283 is OK

MISSING DOIs

- Errored finding suggestions for "Sesame: The Los Alamos National Laboratory Equatio...", please try later
- Errored finding suggestions for "Stellar Collapse: Microphysics", please try later
- Errored finding suggestions for "Singularity-EOS: Performance Portable Equations of...", please try later
- Errored finding suggestions for "Singularity-Opac: Performance Portable Opacities", please try later
- Errored finding suggestions for "Phoebus: Phifty One Ergs Blows Up A Star", please try later
- 10.2307/j.ctv6wggx8.17 may be a valid DOI for title: Ports-of-Call
- Errored finding suggestions for "Catch2", please try later
- Errored finding suggestions for "Numerical Recipes with Source Code CD-ROM 3rd Edit...", please try later

INVALID DOIs

- None
Yurlungur commented 2 years ago

Thanks, @lgarrison, @jzrake, @dfm. Looking forward to interacting with you during the review process.

dfm commented 2 years ago

@lgarrison, @jzrake — This is just a little ping to make sure that this review stays on your radar. It's good to start chipping away at the checklists sooner rather than later!

jzrake commented 2 years ago

Review checklist for @jzrake

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

jzrake commented 2 years ago

@Yurlungur -- I hope I'm doing this right. Here are two bits of feedback based on my first impression from the PDF writeup.

  1. GPUs have been able to do interpolation (1d/2d/3d) in hardware as a primitive operation since long before GPGPU was a thing. I think the writeup should state why a software library to do interpolation on the GPU is needed, or how it is better than what GPUs do in the hardware pipeline.
  2. The Statement of Need makes it seem like the primary purpose of the library is for reading opacity tables in hydro codes, which is a relatively specialized application. Is there something about this library which is aimed specifically at such applications, or is the library also intended to offer better (faster?) interpolation for other applications, e.g. image resampling? If it's the specific application, then the Statement of Need should mention (if it's true) that these hydro codes are currently limited either by the accuracy or performance of lookups in the opacity and/or EOS tables.
Yurlungur commented 2 years ago

Thanks, @jzrake. It'll take me a little while to put together a formal response to these comments and implement changes to the manuscript. I should have something soon.

Yurlungur commented 2 years ago

@jzrake thanks for the comments. I just updated the manuscript based on your feedback. The relevant commit is here. Here's a formal little writeup to explain the answers to your questions:

  1. It's true that hardware interpolation is required for graphics applications, and is thus a feature of GPUs. Indeed, one of our team members applied texture interpolation as early as 2007. However, we felt a software layer was required for our (and other) scientific applications for several reasons. Texture interpolation, at least on NVIDIA devices, is only single-precision, with interpolation coefficients stored at half-precision. This is often insufficient for scientific applications. Texture interpolation is also rather constrained in application, to only a few stencil patterns and to uniform data only. While Spiner currently limits itself to linear interpolation on uniform data, we wanted to leave the door open to other algorithms. Texture interpolation also does not support multi-dimensional mixed indexing/interpolation operations where, say, three indices of a four-dimensional array are interpolated and one is merely indexed into. Texture interpolation also, by design, operates performantly on vectors of data only, rather than on a single element. While obviously GPUs are vector machines, downstream applications may want to build more complicated operations on scalar interpolation primitives. For example, equation of state lookups often involve a root find on interpolated data, which is easier to reason about in scalar form. Finally, the intent of Spiner is that the same code base can be used both on CPU and GPU, and on whatever comes next. In other words, that the code be portable. This necessitates a software layer of some kind. That said, a specialization of Spiner that uses hardware intrinsics when appropriate would be an interesting topic of future work. We have added comments to the manuscript emphasizing these points.

  2. We wrote Spiner out of a specific need for such a capability for equation of state and opacity data for continuum dynamics codes, and thus that has been our focus. To our knowledge there is no such standalone capability in the literature, although individual codes have certainly come up with their own internal solutions. Thus we believe our work fills a gap by being a "plug and play" capability for these codes that does not sacrifice performance. Continuum dynamics codes of this kind are broadly applicable to a large number of precision scientific applications, including but not limited to astrophysics, geophysics, climate modeling, and simulations of interest to national defense. Together these applications consume a very large share of the available supercomputer cycles. That said, interpolation is of course a very broad topic and, as you point out, Spiner likely has applications beyond hydro codes. However, we haven't thought very deeply about other applications. We have added a sentence in the statement of need emphasizing that there is no other performance-portable standalone capability that we know of. We also added a comment about other applications.
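
To make the "mixed indexing/interpolation" point in item 1 concrete, here is a minimal sketch in plain Python. It is not Spiner's API: `UniformGrid`, `locate`, and `interp_2d_indexed` are hypothetical names chosen for illustration, and the example uses a three-dimensional table (two axes interpolated, one indexed) rather than the four-dimensional case mentioned above, to keep it short.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformGrid:
    """Hypothetical helper: n evenly spaced nodes on [xmin, xmax]."""
    xmin: float
    xmax: float
    n: int

    @property
    def dx(self) -> float:
        return (self.xmax - self.xmin) / (self.n - 1)

    def locate(self, x: float):
        """Return (i, w): x sits a fraction w of the way from node i to node i+1.

        The cell index is clamped to the table, so an out-of-range x
        extrapolates linearly from the nearest edge cell (w outside [0, 1]).
        """
        t = (x - self.xmin) / self.dx
        i = min(max(int(math.floor(t)), 0), self.n - 2)
        return i, t - i


def interp_2d_indexed(table, grid_y, grid_x, k, y, x):
    """Bilinear interpolation in (y, x); the leading axis k is only indexed."""
    j, wy = grid_y.locate(y)
    i, wx = grid_x.locate(x)
    plane = table[k]  # plain integer index, no interpolation on this axis
    lower = (1 - wx) * plane[j][i] + wx * plane[j][i + 1]
    upper = (1 - wx) * plane[j + 1][i] + wx * plane[j + 1][i + 1]
    return (1 - wy) * lower + wy * upper


# Toy data (an assumption for the demo): f_k(y, x) = k + 2*y + 3*x,
# which bilinear interpolation reproduces exactly.
grid_x = UniformGrid(0.0, 1.0, 5)
grid_y = UniformGrid(0.0, 2.0, 3)
table = [
    [[k + 2 * (grid_y.xmin + j * grid_y.dx) + 3 * (grid_x.xmin + i * grid_x.dx)
      for i in range(grid_x.n)]
     for j in range(grid_y.n)]
    for k in range(2)
]
value = interp_2d_indexed(table, grid_y, grid_x, k=1, y=0.7, x=0.33)
```

Because the lookup is a scalar function of scalar arguments, it can be called from inside other per-point logic, which is the design property the reply above argues for.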

jzrake commented 2 years ago

@Yurlungur thank you for the thoughtful reply. I hope this content can be included in the PDF writeup, as I think it significantly clarifies the motivation and objectives of the project.

@dfm is the PDF writeup intended to be very concise, or can it be as detailed as the reply above?

EDIT: I only just saw the additions to the PDF writeup. I'll give another round of comments soon, if I have any.

jzrake commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

dfm commented 2 years ago

@Yurlungur, @jzrake — Re: manuscript scope. Our goal here is for the manuscript to be brief with the documentation page being the primary standalone source of information, so it's generally a good idea to add some words to the documentation whenever you're extending the manuscript. I prefer the manuscript to not have any unique material (everything should be in the docs one way or another), but that's just my preference and not a requirement!

Yurlungur commented 2 years ago

Thanks for the clarification, @dfm. Currently there's no statement of need in the docs. I will add one that mirrors the discussion here and in the manuscript.

Thanks, @jzrake. Please let me know if the additions to the manuscript address concerns, or if I should further extend it based on the discussion here.

Yurlungur commented 2 years ago

☝️ documentation updated in linked PR (now merged).

lgarrison commented 2 years ago

Hi @Yurlungur! I'm working on my review and as per @dfm's suggestion, I'll ask questions as they come up.

On the performance, the results compared to the CPU look impressive, but it still feels like some additional context would be helpful. The paper motivates Spiner by saying the interpolation should not be the limiting factor in a radiation transport simulation—is that now the case? Do you have timings you can share for the interpolation step versus the total time step in a typical simulation?

Alternatively, you could estimate the achieved FLOPs compared to theoretical peak, either by using the CUDA profiler or by estimating the number of operations by hand. Either the % of peak or a comparison to the total simulation time would help contextualize your success here!

Finally, can you clarify in the writeup whether the performance test was done in single or double precision? Perhaps double, since the paper starts by discussing precision limitations in texture interpolation?

Yurlungur commented 2 years ago

Thanks, @lgarrison, and thanks also for all the issues and PRs. This week has been incredibly busy, so I haven't had a chance to go through everything in detail. But I wanted to let you know I'm on it and should have some updates soon.

lgarrison commented 2 years ago

I'm happy to say I was able to reproduce the performance benchmarks on a V100! That's all the items on my JOSS checklist. I think the last thing I would like to see addressed is my comment above about adding context for the achieved performance.

Yurlungur commented 2 years ago

Thanks for quickly digging into all of that, @lgarrison! I just updated the paper to address your comments. Here's a more detailed discussion:

Regarding the performance results on CPU vs. GPU and context for the performance of Spiner: We believe that the word "speedup" is misleading, as the goal here is not to speed up an EOS calculation in a CPU code by offloading it to GPU, although that's a benefit of the library. Rather, the goal is to provide an interpolation capability that runs natively on CPU and GPU so that a GPU-native code (for example) can simply call the interpolation routines from device kernels. We have thus removed the discussion of a "speedup" from the performance section and replaced it with, as you suggested, a comparison to peak performance. On GPUs, memory bandwidth is often the limiting factor, so we compare to both total peak FLOPS and peak memory movement including both reads and writes. By the former measure, we're at about 15% of peak. By the latter, we achieve about 200 GB/s both read and write. The maximum bandwidth of the device is 900 GB/s.
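
As a rough illustration of the bandwidth comparison, the sketch below turns the two figures quoted above into a fraction of peak, and shows how a by-hand traffic estimate could be formed. Only the 200 GB/s and 900 GB/s numbers come from this thread; the function names, the per-point byte counts, the point count, and the timing are hypothetical inputs, not measurements. The 200 GB/s figure is treated here as a single combined number, which is an assumption; if it applies to reads and writes separately, the combined fraction would be higher.

```python
def achieved_bandwidth_gb_s(n_points, bytes_read_per_point,
                            bytes_written_per_point, seconds):
    """Estimate sustained memory traffic in GB/s (1 GB = 1e9 bytes)."""
    total_bytes = n_points * (bytes_read_per_point + bytes_written_per_point)
    return total_bytes / seconds / 1e9


def fraction_of_peak(achieved, peak):
    """Ratio of an achieved rate to the device's peak rate (same units)."""
    return achieved / peak


# Figures quoted in the thread.
QUOTED_ACHIEVED_GB_S = 200.0
QUOTED_PEAK_GB_S = 900.0
quoted_fraction = fraction_of_peak(QUOTED_ACHIEVED_GB_S, QUOTED_PEAK_GB_S)

# Hypothetical by-hand estimate: a trilinear lookup in double precision
# reads 8 corner values (64 bytes) and writes 1 result (8 bytes),
# ignoring caching and the cost of reading coordinates.
hypothetical_gb_s = achieved_bandwidth_gb_s(
    n_points=100_000_000,
    bytes_read_per_point=64,
    bytes_written_per_point=8,
    seconds=0.05,  # made-up kernel time, for illustration only
)

print(f"quoted fraction of peak bandwidth: {quoted_fraction:.0%}")
print(f"hypothetical estimate: {hypothetical_gb_s:.0f} GB/s")
```

Under that reading, the quoted figures correspond to roughly 22% of peak bandwidth.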

That said, we thank you for the suggestion that we add some context about performance. We now note in the "state of the field" section that in radiation hydrodynamics calculations, root finding and microphysics subroutines, with interpolation as the innermost operation, can take up to 10% of the runtime. For pure hydrodynamics calculations without expensive radiation transport, the cost of evaluating tabulated equations of state can be significantly larger. Without care, it can be as large as half the runtime.
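
To show why interpolation ends up as the innermost operation, here is a minimal sketch of a root find on tabulated data: given a target energy, find the temperature at which the interpolated table matches it. This is not how Spiner or any downstream code implements it; bisection, the function names, and the toy table `e(T) = T**2` are all assumptions made for illustration.

```python
def lerp_uniform(values, xmin, xmax, x):
    """Linear interpolation of `values` tabulated on a uniform grid."""
    n = len(values)
    dx = (xmax - xmin) / (n - 1)
    t = (x - xmin) / dx
    i = min(max(int(t), 0), n - 2)  # clamp the cell index to the table
    w = t - i
    return (1 - w) * values[i] + w * values[i + 1]


def invert_by_bisection(values, xmin, xmax, target, tol=1e-12, max_iter=200):
    """Find x in [xmin, xmax] where the interpolant equals `target`.

    Every iteration calls the interpolation routine once, so the lookup
    cost is multiplied by the number of root-find iterations.
    """
    lo, hi = xmin, xmax
    f_lo = lerp_uniform(values, xmin, xmax, lo) - target
    f_hi = lerp_uniform(values, xmin, xmax, hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed by the table")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = lerp_uniform(values, xmin, xmax, mid) - target
        if f_lo * f_mid <= 0:
            hi = mid  # sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # sign change lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Toy table (hypothetical): energy e(T) = T**2 sampled at 9 nodes on [1, 5].
T_MIN, T_MAX, N = 1.0, 5.0, 9
energies = [(T_MIN + i * (T_MAX - T_MIN) / (N - 1)) ** 2 for i in range(N)]
T_found = invert_by_bisection(energies, T_MIN, T_MAX, target=6.5)
```

The recovered temperature is the root of the piecewise-linear interpolant, not of the underlying smooth function, so its accuracy is limited by the table resolution.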

The performance test was indeed done at double precision.

Yurlungur commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

lgarrison commented 2 years ago

Thanks @Yurlungur! I'm happy to give 👍 to publication.

Yurlungur commented 2 years ago

Thanks for the in-depth review and helpful comments, @lgarrison!

Yurlungur commented 2 years ago

@jzrake I just wanted to quickly reach out---have you had a chance to go through the revised manuscript and if so, do you have any additional comments?

jzrake commented 2 years ago

@Yurlungur, I will give you any final comments in the next 1-2 days.

Yurlungur commented 2 years ago

Sounds good, @jzrake, thank you!

dfm commented 2 years ago

@jzrake — just a little ping to keep this on your radar!

jzrake commented 2 years ago

@Yurlungur, I am ready to sign off on this article. I don't have any further comments since you updated the Statement of Need. All the items on my checklist are checked, except for "performance" -- I have not compiled the code on a GPU node.

dfm commented 2 years ago

@editorialbot generate pdf

dfm commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.3847/1538-4365/ab007f is OK
- 10.3847/1538-4365/ab09fc is OK
- 10.1088/0264-9381/27/11/114103 is OK
- 10.3847/0004-637X/816/1/44 is OK
- 10.1109/TPDS.2021.3097283 is OK

MISSING DOIs

- 10.2307/j.ctv6wggx8.17 may be a valid DOI for title: Ports-of-Call

INVALID DOIs

- None
editorialbot commented 2 years ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

dfm commented 2 years ago

@jzrake, @lgarrison — Thanks for your thorough and constructive reviews! I really appreciate the time you took to help with this.

@Yurlungur — I've opened a PR with some minor edits to the paper. After merging or responding to that, here are the final steps that I'll need from you:

  1. Take one last read through the manuscript to make sure that you're happy with it (it's harder to make changes later!), especially the author names and affiliations. I've taken a pass and it looks good to me!
  2. Increment the version number of the software and report that version number back here.
  3. Create an archived release of that version of the software (using Zenodo or something similar). Please make sure that the metadata (title and author list) exactly match the paper. Then report the DOI of the release back to this thread.
Yurlungur commented 2 years ago

Thanks, @jzrake and @lgarrison! I appreciate the time you spent engaging with me and the project.

Thanks, @dfm. I've merged the PR, done a final read-through, and incremented the version number.

dfm commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

dfm commented 2 years ago

@editorialbot set 10.5281/zenodo.6800124 as archive

editorialbot commented 2 years ago

Done! Archive is now 10.5281/zenodo.6800124

dfm commented 2 years ago

@editorialbot set v1.5.1 as version

editorialbot commented 2 years ago

Done! version is now v1.5.1

dfm commented 2 years ago

@editorialbot recommend-accept

editorialbot commented 2 years ago
Attempting dry run of processing paper acceptance...
editorialbot commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.3847/1538-4365/ab007f is OK
- 10.3847/1538-4365/ab09fc is OK
- 10.1088/0264-9381/27/11/114103 is OK
- 10.3847/0004-637X/816/1/44 is OK
- 10.1109/TPDS.2021.3097283 is OK

MISSING DOIs

- 10.2307/j.ctv6wggx8.17 may be a valid DOI for title: Ports-of-Call

INVALID DOIs

- None
editorialbot commented 2 years ago

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉📄 Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/3350, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept