Closed brittonsmith closed 3 weeks ago
Hi Britton, this is looking fantastic! I have a lot of thoughts on this, but I'm a little limited on time right this moment.
I wanted to share a few initial thoughts, based on taking a glance at the outputs (I didn't look at your code yet). I'll come back later and add more.
Let me preface this by saying, I definitely don't want to bog down your efforts with endless bike-shedding, especially since this is something we could change later without breaking user-facing code. Before going on, I do think we should add a top-level json attribute like "format-version" in order to more easily facilitate changes.
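A minimal sketch of how a reader might use such an attribute (the key name "format-version" and the version values here are assumptions, not anything decided in this PR):

```python
import json

# Hypothetical sketch: a top-level "format-version" attribute lets readers
# dispatch on (or reject) file layouts that change later.
def load_answers(text, supported=(1,)):
    """Parse an answer file, failing fast on an unknown format version."""
    data = json.loads(text)
    version = data.get("format-version")
    if version not in supported:
        raise ValueError(f"unsupported format-version: {version!r}")
    return data

loaded = load_answers('{"format-version": 1, "tests": []}')
```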
Now, I want to pose an open-ended generic question (that we don't necessarily need to answer now): "How much do we care about specifying Grackle's precise state?"
For example, any difference between the `code_units` instance used to initialize the `chemistry_data_storage` struct and the instance used when solving chemistry/calculating local derived quantities will slightly alter the results. (We could dump `chemistry_data`, `code_units`, and `grackle_version` to json, but I'm not sure that will help much here.)

Finally, I wanted to share a few other quick thoughts. A lot of this is spitballing (so I tried to label how strongly I feel about certain things -- from strong suggestion to an idea):

- strong suggestion: to simplify the json parsing (especially in C/C++ test code), I think we should always put the data associated with each input field inside a 1D list (even if it is only a single element).
- potential idea: put all fields inside a nested group within "fields". For clarity, a given "input" json object would contain a nested object containing "density", "x-velocity", etc. (Not sure if this is necessary)
- suggestion: maybe record other information about the fields, including 'grid_dx' (could be omitted when not relevant), 'grid_rank', and 'grid_dimension' (to make parsing easy, I would always make the last one hold 3 elements -- the valid elements are specified by grid_rank)
- strong suggestion: to support C/C++ unit-tests (and anybody who wants to manually specify one of these json files), it would probably be better to use the `grackle_field_data` member-names instead of the names we use in `FluidContainer` (e.g. `velocity_x` vs `x-velocity`, `e_density` vs `de`, `internal_energy` vs `energy`, `HII_density` vs `HII`)
- idea: rather than specifying redshift, maybe we should be specifying the current_scale_factor since that is what the underlying code uses... (unfortunately, this introduces a little ambiguity with respect to a_units). I think this is tied to the broader question I asked up above
- idea: maybe we should be naming the scenarios (I don't know what to call the high-level json objects) and tests rather than putting them in lists. This could make it easier for humans to map between the stuff being loaded in code and what is found in the json file... (especially if errors occur).
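For concreteness, here is one hypothetical way a file could look with these suggestions applied (all key names and values below are illustrative, not a proposal for the final format):

```python
import json

# Hypothetical sketch of a file layout combining the suggestions above;
# none of the key names or values here are final.
example = {
    "input": {
        "grid_rank": 1,
        # always 3 elements; only the first grid_rank entries are meaningful
        "grid_dimension": [1, 1, 1],
        "grid_dx": 0.1,  # could be omitted when not relevant
        "fields": {
            # grackle_field_data member names, each value a 1D list
            # (even when there is only a single element)
            "density": [1.67e-24],
            "internal_energy": [1.0e12],
            "HII_density": [1.0e-28],
        },
    },
}

# round-trip through json, as a C/C++ test reader would see it
parsed = json.loads(json.dumps(example))
densities = parsed["input"]["fields"]["density"]
```

Keeping every field a 1D list means a C/C++ reader can handle the scalar and array cases with the same code path.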
My response clearly got a little out of hand! As I mentioned above, we definitely don't need to decide on everything now (especially if we label the file-version)
Matthew, many thanks for these thoughts. With regard to your broader question, I think we should care quite a bit about knowing Grackle's state. I don't want to quantify that at the moment, but it seems a large number of the issues users encounter stem from improper units. When this code was all in Enzo, it wasn't such a dangerous proposition to initialize the chemistry with one set of units, throw them away, and then rely on the user to provide new (and likely different) but consistent ones. I'd prefer this discussion to minimally impact the rest of the work here as I think it will be easier to allow the format used here to change as this discussion evolves.
- strong suggestion: to simplify the json parsing (especially in C/C++ test code), I think we should always put the data associated with each input field inside a 1D list (even if it is only a single element).
Sure, can do.
- potential idea: put all fields inside a nested group within "fields". For clarity, a given "input" json object would contain a nested object containing "density", "x-velocity", etc. (Not sure if this is necessary)
- suggestion: maybe record other information about the fields, including 'grid_dx' (could be omitted when not relevant), 'grid_rank', and 'grid_dimension' (to make parsing easy, I would always make the last one hold 3 elements -- the valid elements are specified by grid_rank)
I'm agnostic on these as I sort of expected anyone using the json file for their own purposes would have to do some additional by-hand setup, since I'm not specifying the complete state, only the inputs I deemed minimally necessary. If anything, I was hoping that this approach would validate or falsify the notion that what I have there are in fact the minimal inputs to supply. As well, I was wary of overcomplicating things by trying to classify the various forms of "input" so this could all be initialized in the most general way.
- strong suggestion: to support C/C++ unit-tests (and anybody who wants to manually specify one of these json files), it would probably be better to use the `grackle_field_data` member-names instead of the names we use in `FluidContainer` (e.g. `velocity_x` vs `x-velocity`, `e_density` vs `de`, `internal_energy` vs `energy`, `HII_density` vs `HII`)
You're right on about this. In fact, I think we should be moving to consistent names across all the APIs. I'd be in favor (if not necessarily in this PR) of deprecating the Pygrackle-only names and phasing them out.
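One possible shape for that deprecation path, sketched in Python (this is not pygrackle's actual machinery; the alias table and function here are invented for illustration):

```python
import warnings

# Sketch only: map the Pygrackle-only field names onto the
# grackle_field_data member names, warning on deprecated access.
_DEPRECATED_ALIASES = {
    "x-velocity": "velocity_x",
    "de": "e_density",
    "energy": "internal_energy",
    "HII": "HII_density",
}

def canonical_name(name):
    """Return the grackle_field_data member name for a field."""
    if name in _DEPRECATED_ALIASES:
        new_name = _DEPRECATED_ALIASES[name]
        warnings.warn(
            f"field name {name!r} is deprecated; use {new_name!r}",
            DeprecationWarning, stacklevel=2)
        return new_name
    return name
```

Routing all field lookups through something like this would let the old names keep working for a release or two while nudging users toward the consistent set.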
- idea: rather than specifying redshift, maybe we should be specifying the current_scale_factor since that is what the underlying code uses... (unfortunately, this introduces a little ambiguity with respect to a_units). I think this is tied to the broader question I asked up above
My issue with this is I think user intuition is in redshift-space and not so much scale-factor. That said, the cooling tables themselves are evenly spaced in log(1 + z) so it may make sense to sample randomly from that to cover the data better.
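Sampling uniformly in log10(1 + z) rather than in z could look like this (the redshift range and seed are arbitrary choices for illustration):

```python
import numpy as np

# The cooling tables are evenly spaced in log(1 + z), so drawing test
# redshifts uniformly in that variable covers the tables more evenly
# than sampling z directly.
rng = np.random.default_rng(seed=42)

def sample_redshifts(n, z_min=0.0, z_max=10.0):
    """Draw n redshifts uniformly in log10(1 + z)."""
    lo, hi = np.log10(1.0 + z_min), np.log10(1.0 + z_max)
    return 10.0 ** rng.uniform(lo, hi, size=n) - 1.0

zs = sample_redshifts(1000)
```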
- idea: Maybe we should be naming the scenarios (I don't know what to call the high-level json objects) and tests rather than putting them in lists. This could make it easier for humans to map between the stuff being loaded in code and what is found in the json file... (especially if errors occur).
I considered this, but I concluded that the number of different value combinations we might like to test would vastly exceed our naming vocabulary. As well, the values are currently totally random and don't map to a particular scenario. There could be value in testing specific named situations, though.
That's all I've got for now. I'm happy to keep iterating on any of these points, but some I wholeheartedly agree with and will implement. I'll then continue on with something similar for tests of the Python examples. This is the main thing I'd like to have in place for finalizing Gen's PR.
Hi @brittonsmith, thank you very much for this PR. Given my limited knowledge of the Grackle code, I'm not in a position to discuss the technical details of the proposed solution, but if I correctly understand the general approach for dataset generation, I think it is exactly what we need for extensive testing of the code!
@ChristopherBignamini, excellent. A high priority is to get something that works for you, so please do let me know if there are changes to the json format that would be helpful. Following Matthew's suggestions, I have made a few changes. The current version of the file can be seen here: http://paste.yt-project.org/show/571/ Please, let me know how this works for you.
I meant to take a look at this, but the day got away from me. A few quick thoughts (I'm happy to expand on it).
The files are currently named using the indices of the parameter and input sets. If we change the range of parameters and inputs, these could get scrambled and would need to be regenerated. Maybe we could do something like what @mabruzzo suggested previously and give each testing set some sort of string name and save the files using that.
I like that, but I'm obviously a little biased
As is clear from the test failures, there seems to be some machine dependence in the answers. I am hesitant to simply loosen the tolerances until they pass. It may be better to move to a system like in enzo-dev, where the CI generates its own answers by moving to a "gold standard" commit, then moves forward to the current version for the actual comparison. Doing this would require significant modification of the CircleCI config script, so I'd like to have a discussion here about whether this is the right way to go before I do it.
I agree this is probably the way to go. I have my hesitations, but I suspect it's probably the most rigorous solution
With that said, I'm a little optimistic that we could reuse some machinery to make this easier. In particular, if we were able to merge in #208, and took some of the answer-testing machinery from Enzo-E[^1] (namely chunks of the `conftest.py` and `answer_testing.py`), I think the minimum amount of CircleCI machinery to get this working would be relatively simple. We could add a new job that installs dependencies, clones grackle, then simply runs
```shell
cd grackle
git tag tip
# build/install the gold-standard (this freshly builds Grackle)
git checkout ${GOLD_STANDARD}
pip install .
# run the gold-standard
cd src/python
pytest tests --answer-store --local-dir=./gold-standard-answers
# cleanup
pip uninstall -y pygrackle
cd -
# build/install the version of grackle we want to test
git checkout tip
pip install .
# run the tests
cd src/python
pytest tests --local-dir=./gold-standard-answers
```
(I could be wrong about the precise arguments -- I'm happy to help with that!) Alternatively, we could pull machinery from yt. The command line interface is very similar (I modeled Enzo-E's interface after yt), but I think their implementation is a little more idiomatic for pytest. I'm also happy to take a look at all of this and help.
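For reference, the store/compare pattern behind an `--answer-store` flag can be sketched roughly like this (the function name and file layout are invented for illustration; the real Enzo-E/yt machinery is more involved):

```python
import json
import os
import tempfile

import numpy as np

def check_answer(name, values, answer_dir, store, rtol=1e-8):
    """Store gold-standard answers, or compare new values against them."""
    path = os.path.join(answer_dir, f"{name}.json")
    if store:
        # gold-standard run: record the answers
        with open(path, "w") as f:
            json.dump(list(values), f)
    else:
        # comparison run: load the stored answers and compare
        with open(path) as f:
            expected = json.load(f)
        np.testing.assert_allclose(values, expected, rtol=rtol)

# demo: a "gold standard" run stores answers, a later run compares
answer_dir = tempfile.mkdtemp()
check_answer("mean_molecular_weight", [1.22, 0.6], answer_dir, store=True)
check_answer("mean_molecular_weight", [1.22, 0.6], answer_dir, store=False)
```

In the CI flow above, the first pytest invocation would run everything with the equivalent of `store=True` and the second with `store=False` against the same `--local-dir`.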
We don't currently do this, but we could embed the grackle configuration into the yt dataset that is saved to verify the correct parameter set is being compared against. We could also embed some sort of format version number this way and add some assert to the testing.
That certainly would be great! But, would that take much time to implement?
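A rough sketch of that idea, using a json answer file as a stand-in for the saved yt dataset (the key names and helper functions here are invented):

```python
import json
import os
import tempfile

def write_answer_file(path, data, parameters, version=1):
    """Save answers alongside the parameters that produced them."""
    with open(path, "w") as f:
        json.dump({"format-version": version,
                   "grackle-parameters": parameters,
                   "data": data}, f)

def read_answer_file(path, expected_parameters, expected_version=1):
    """Load answers, asserting the parameter set and format match."""
    with open(path) as f:
        doc = json.load(f)
    assert doc["format-version"] == expected_version, "format-version mismatch"
    assert doc["grackle-parameters"] == expected_parameters, \
        "answers were generated with a different parameter set"
    return doc["data"]

# demo round-trip with a couple of real grackle parameter names
params = {"primordial_chemistry": 2, "metal_cooling": 1}
path = os.path.join(tempfile.mkdtemp(), "answers.json")
write_answer_file(path, [1.0, 2.0], params)
data = read_answer_file(path, params)
```

The assert on load is what catches the "comparing against answers from a different configuration" failure mode early, rather than as a mysterious numerical mismatch.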
[^1]: I don't think we would even need to include the license since you and I are the only people who have touched that code.
@mabruzzo, this is now ready for review. Here is a summary of the changes:
- `setup_fluid_container` helper function to standardize python examples

Note: the grackle_data_files submodule temporarily points to my fork as I have removed the old test answer files. When this is ready to go, I propose we do the following: once you approve the PR, I will merge grackle_data_file PR #10, then update this PR to point back to the main repo with the appropriate changeset. When the tests pass one last time, we merge.
I seem to have introduced a testing issue with the last few changes. I'll fix that now.
Update: the failing test was introduced when I added the testing of the code example results to the test suite. This happened because I now import from pygrackle within test_code_examples.py (we didn't before), but pygrackle can't be built with Grackle compiled with openmp. I've fixed the issue by simply not having CircleCI run `pytest tests/test_code_examples.py` and having it just manually run the cxx_omp_example.
Great! I started going through it earlier today and it looks great so far! I'll try to come back to it tomorrow (but probably not before the Grackle call)
Final update: while going through issues, I realized we are within striking distance of closing off Issue #172 (and it's relevant), so I added documentation for installing development dependencies. With this PR, I believe we can also close Issue #171 and Issue #134.
@mabruzzo, I believe I've responded to all your comments now. Here's a summary of changes:
- `python/tests/testing_common.py`
Final update: I added some documentation for setting the grackle data path. I'm happy for all of this to be redone if/when you have a new solution.
@mabruzzo, this might have been presumptuous, but I've now merged PR #10 and updated the submodule configuration here. I hadn't noticed that you approved some time ago. I'm not sure if you want to re-approve the latest changes, but if things look good you can merge.
I'm merging it in. Just a heads up: I'm working on a PR, to be posted on Monday, that totally updates how we handle the data directory. This is going to totally break any code relying upon the behavior of `GRACKLE_DATA_DIR` that is introduced in this PR. But I don't think that will be a problem.
I'm rewriting the PR description now that this is nearing a completed state.
This PR represents a significant refactor of the FluidContainer and the infrastructure for running and testing the example Python scripts. Some of this includes responses to earlier comments made here.
FluidContainer refactor

- removed `calculate_hydrogen_number_density` and implemented a generalized `calculate_nuclei_density` method that accepts an element name
- added a `calculate_cooling_rate` method
- added a `finalize_data` method to return copies of data arrays with units added (as `unyt_array`s)
- refactored the `setup_fluid_container` helper function to cover all density fields and added two initial "state" options (neutral and ionized)
- refactored `evolve_constant_density` and `evolve_freefall` to standardize the data they return

Testing and example script refactor
The original test_examples.py code allowed for minor alterations of the Python examples for testing via changing environment variables. I have replaced this with `test_models.py`, which allows one to define a grid of parameters and input variables over which to run and produce output for comparison. When the Python examples are run by the tester, it passes `sys.argv` arguments to denote the specific parameter and input configurations. The Python script then calls `get_model_set` to get a properly configured `chemistry_data` object and a dictionary of variables that I then bring into the runtime namespace with calls to `globals()`. We don't have to do it this way. We can also just explicitly set variables (e.g., `metallicity = input_set["metallicity"]`). In doing this, I have also altered the `yt_grackle.py` script to save output so we can test against that as well, rather than just verifying that it runs without error.

As mentioned in the previous PR description, I've also added `test_local_functions.py` to do answer testing on the local functions using a grid of parameters and randomized, reasonable inputs. This both allows us to do explicit answer testing on these and also gives others a base of stock test answers for doing the same for development with the primary API. This will generate and test from a json file which looks like the following: http://paste.yt-project.org/show/580/

Remaining to-dos
`FluidContainer` and `setup_fluid_container`. I'll have a look at those.

Points for discussion
This introduces a lot more answer files to store in the grackle_data_files submodule. I suppose that isn't a problem itself, but I can already see a few complications.