lanl / spiner

Performance portable routines for generic, tabulated, multi-dimensional data
https://lanl.github.io/spiner
BSD 3-Clause "New" or "Revised" License

Port potential Databox enhancements into Spiner from Singe -- transformations #95

Open BrendanKKrueger opened 2 months ago

BrendanKKrueger commented 2 months ago

PR Summary

Proposing changes to Spiner based on a wrapper that was written in Singe, in order to provide:

PR Checklist

BrendanKKrueger commented 2 months ago

@dholladay00 and @Yurlungur, these are some features from Singe that Jonah and I discussed transferring over to Spiner to make them more widely accessible. My main question at the moment is whether you want these features to (a) be added directly to DataBox, (b) be added to a wrapper around or extension to DataBox, or (c) be kept in Singe.

Yurlungur commented 2 months ago

Broadly I think I'm supportive of these features moving into spiner... but looking at the header file you added for this MR I wonder how much work it would be because I notice that your version pretty explicitly talks about, e.g., density and temperature, which I don't think we want to do with spiner. It should be more generic.

BrendanKKrueger commented 2 months ago

Oh, absolutely it should be (and will be) more generic. The first commit was just copying over the Singe file to show a starting point, but there's plenty in that file that's specific to Singe and that will need to be cleaned up to be more appropriate for Spiner.

BrendanKKrueger commented 1 month ago

A few implementation questions that came up as I started adapting RegularGrid1D to have transformations:

BrendanKKrueger commented 1 month ago

Another question we need to settle: the existence and behavior of operator() is a problem.

BrendanKKrueger commented 1 month ago

If someone can check saveHDF and loadHDF, that would be helpful -- I made very simple, naive changes.

BrendanKKrueger commented 1 month ago

Related to the operator() / accessor discussion: Should set_data_value (or whatever we name it) be const? I made it const because there's already an operator() const that returns a mutable reference. This implies that when a DataBox is const, the metadata of the DataBox is const, but the dependent variable data is not.
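A minimal sketch of the const semantics in question, assuming the class holds a non-owning pointer to its data. `ConstViewBox` and its members are hypothetical stand-ins for illustration, not Spiner's actual DataBox:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a DataBox-like view; not Spiner's actual class.
class ConstViewBox {
 public:
  ConstViewBox(double *data, std::size_t n) : data_(data), n_(n) {}

  // Const, yet returns a mutable reference: const applies to the pointer
  // and the size (the metadata), not to the values the pointer refers to.
  double &operator()(std::size_t i) const {
    assert(i < n_);
    return data_[i];
  }

  // Const for the same reason: it writes through the pointer but never
  // changes the pointer or the size.
  void set_data_value(std::size_t i, double v) const {
    assert(i < n_);
    data_[i] = v;
  }

  std::size_t size() const { return n_; }

 private:
  double *data_;   // non-owning; inside a const object this is double *const
  std::size_t n_;  // metadata; cannot change through a const object
};
```

With this layout, a `const ConstViewBox` can still modify the values it points at, which is the "metadata is const, data is not" behavior described above.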

Yurlungur commented 1 month ago

Instead, I've opted for a "safer" option: write new methods get_data_value and set_data_value (I'm not attached to these names), which are classic accessors to the dependent variable values, which allow the DataBox to handle the transformations correctly.

This makes complete sense to me. I'm in favor.

Regarding operator()

I feel pretty strongly that operator() needs to stay, as DataBox sometimes plays the role of a multi-D array, not an interpolator, and that's what some of the internal metadata is for. That said, I think I'm okay with your compromise solution, so long as we have a public accessor for the underlying pointer, which I think we do.
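To illustrate the compromise, here is a rough sketch in which operator() stays as raw array access while get_data_value and set_data_value apply the transformation. Everything except those two accessor names is an assumption for illustration: `TransformedBox`, `LogTransform`, `forward`, `reverse`, and `data()` are hypothetical and do not reflect Spiner's actual interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical transformation: the box stores log(y) instead of y.
struct LogTransform {
  static double forward(double y) { return std::log(y); }
  static double reverse(double u) { return std::exp(u); }
};

// Hypothetical DataBox-like class; not Spiner's actual implementation.
template <typename Transform>
class TransformedBox {
 public:
  TransformedBox(double *data, std::size_t n) : data_(data), n_(n) {}

  // Raw array access: returns the stored (transformed) value untouched,
  // so the box can still be used as a plain array.
  double &operator()(std::size_t i) const {
    assert(i < n_);
    return data_[i];
  }

  // Accessors in the user's units: the transformation is applied on the
  // way in and undone on the way out.
  double get_data_value(std::size_t i) const {
    return Transform::reverse((*this)(i));
  }
  void set_data_value(std::size_t i, double y) const {
    (*this)(i) = Transform::forward(y);
  }

  // Public accessor for the underlying pointer.
  double *data() const { return data_; }

 private:
  double *data_;
  std::size_t n_;
};
```

In this sketch a caller who writes through operator() is responsible for supplying already-transformed values, while a caller who uses the accessors never sees the transformed representation.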

Yurlungur commented 1 month ago

Related to the operator() / accessor discussion: Should set_data_value (or whatever we name it) be const? I made it const because there's already an operator() const that returns a mutable reference. This implies that when a DataBox is const, the metadata of the DataBox is const, but the dependent variable data is not.

I think that's right

BrendanKKrueger commented 1 month ago

The head of main isn't formatted correctly. I ran clang-format, but now I'm rolling back changes to lines I didn't touch. I'll leave it to y'all whether you want to run clang-format over everything or not.

Yurlungur commented 1 month ago

The head of main isn't formatted correctly. I ran clang-format, but now I'm rolling back changes to lines I didn't touch. I'll leave it to y'all whether you want to run clang-format over everything or not.

Let's just submit a separate MR to main that formats the code in one sweep.

BrendanKKrueger commented 4 weeks ago

Should I add the NQT logs and friends now in this PR? Or save them for a later implementation?

Whatever makes you happy. I just didn't feel like getting into adding the NQT stuff as a dependency, because dependency management is often a tricky question and usually best left to a core team member. I'm fine if you add commits to this MR or if you add another MR that depends on this one.

Yurlungur commented 2 weeks ago

Should I add the NQT logs and friends now in this PR? Or save them for a later implementation?

Whatever makes you happy. I just didn't feel like getting into adding the NQT stuff as a dependency, because dependency management is often a tricky question and usually best left to a core team member. I'm fine if you add commits to this MR or if you add another MR that depends on this one.

I'll add them in a later MR... Very busy at the moment so that will keep things moving along.

BrendanKKrueger commented 1 week ago

Earlier I stated:

Do we consider x (the untransformed variable) or u (the transformed variable) to be the "ground truth"? Transformations may not always be perfectly symmetric, which creates the possibility of small gaps appearing. And because those gaps are at critical points (the endpoints of the domain), you could hit them more often than you might think.

Thinking about this more, I think I should clarify my thinking and get your thoughts. For additional clarity, let's define some notation:

Option 1: x is the "ground truth"

Option 2: u is the "ground truth"

The current implementation is option 2 (u is the "ground truth"), but thinking about it more, I think it should be option 1 (x is the "ground truth"). If you agree with that, then I'll have to put in some logic to shift the bounds in u to ensure that the bounds in x_in are within the bounds of x_tr. It's the end of the day, so I may try to address that tomorrow.
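As a rough illustration of the gap being discussed, the sketch below measures how far a value drifts after a round trip through a transformation and checks whether a query in x falls inside bounds that were recovered from u. The log/exp pair and all function names are assumptions chosen for the example, not the transformations or API in this PR.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical transformation pair: u = log(x), x = exp(u).
inline double forward(double x) {
  assert(x > 0.0);  // log is only defined for positive x
  return std::log(x);
}
inline double reverse(double u) { return std::exp(u); }

// Signed round-trip error x_tr - x_in. It is zero only when the pair
// happens to be exactly symmetric for this particular x_in.
inline double round_trip_error(double x_in) {
  return reverse(forward(x_in)) - x_in;
}

// True if x_in lies inside the interval recovered from bounds that are
// stored only in u-space (the "u is the ground truth" situation).
inline bool inside_recovered_bounds(double x_in, double xlo, double xhi) {
  const double xlo_tr = reverse(forward(xlo));
  const double xhi_tr = reverse(forward(xhi));
  return xlo_tr <= x_in && x_in <= xhi_tr;
}
```

If `round_trip_error(xlo)` is positive or `round_trip_error(xhi)` is negative on a given platform, a query at exactly `xlo` or `xhi` would fail `inside_recovered_bounds`, which is the endpoint gap described above.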

BrendanKKrueger commented 1 week ago

I thought about this more last night, and realized that my comment from the end of the day yesterday was incorrect. I now think the following is correct:

I think that tells us:

  1. We need to treat x as the "ground truth" representation, which makes sense since the user works in x-space and u-space should be treated like an internal detail not inherently visible to the user.
  2. I need to check the code and ensure that the xlo and xhi values from the constructor are saved and used for the lower-bound and upper-bound query methods.
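One way the second point might look, sketched under the assumption of a log transformation: the constructor keeps xlo and xhi exactly as given, and the bound queries return those stored values instead of recomputing them from u. `ToyGrid1D` and its member names are hypothetical, not the RegularGrid1D code in this PR.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical grid that is uniform in u = log(x); not Spiner's RegularGrid1D.
class ToyGrid1D {
 public:
  ToyGrid1D(double xlo, double xhi, int n)
      : xlo_(xlo), xhi_(xhi), ulo_(std::log(xlo)),
        du_((std::log(xhi) - std::log(xlo)) / (n - 1)), n_(n) {
    assert(0.0 < xlo && xlo < xhi && n >= 2);
  }

  // Bound queries in x return exactly what the user passed in, so no
  // round trip through the transformation is involved.
  double min() const { return xlo_; }
  double max() const { return xhi_; }

  // The u-space bounds remain available as an internal detail.
  double umin() const { return ulo_; }
  double umax() const { return ulo_ + du_ * (n_ - 1); }

 private:
  double xlo_, xhi_;  // "ground truth" bounds in x, saved verbatim
  double ulo_, du_;   // derived u-space representation
  int n_;
};
```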

PS: The GitHub interface is driving me nuts, because I can't seem to make conversation threads, which means that every conversation gets interleaved with every other conversation.

BrendanKKrueger commented 1 week ago

@Yurlungur, I updated how the bounds are handled, and added some additional testing. I found that we end up with an issue in RegularGrid1D::x(const int i), because that's derived from u so you have to make some sort of decision. I added a "TODO" comment with a few possible ways to handle this and very brief comments on each. I'd be interested in your thoughts on this.
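For concreteness, here is one conceivable way to handle x(i), shown next to the plain "derive everything from u" version. This is only an illustration of the trade-off and is not necessarily one of the options listed in the TODO comment; `ToyLogGrid`, `x_from_u`, and `x_pinned` are hypothetical names, and a log transformation is assumed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical grid, uniform in u = log(x); not Spiner's RegularGrid1D.
struct ToyLogGrid {
  double xlo, xhi;  // bounds exactly as the user supplied them
  int n;            // number of grid points, at least 2

  // Grid point i in u-space.
  double u(int i) const {
    const double ulo = std::log(xlo);
    const double uhi = std::log(xhi);
    return ulo + (uhi - ulo) * i / (n - 1);
  }

  // Always derived from u: x(0) and x(n-1) may differ from xlo and xhi
  // by rounding error in the transformation.
  double x_from_u(int i) const { return std::exp(u(i)); }

  // Endpoints pinned to the stored bounds, interior points derived from u.
  double x_pinned(int i) const {
    assert(0 <= i && i < n);
    if (i == 0) return xlo;
    if (i == n - 1) return xhi;
    return std::exp(u(i));
  }
};
```

Pinning keeps x(0) and x(n-1) consistent with the bound queries, at the cost that x(i) is no longer exactly the reverse transformation of u(i) at the two endpoints.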