Changing element order in materials changes simulation result

shimwell commented 3 years ago

Hi all,

I've been repeating the same simulation but with reordered elements in the material.xml files and noticed that the simulation results change.

I didn't purposefully reorder the elements in the material file, just the parameter runs tend to write out the materials in a different order from time to time.

Anyway with further investigation and production of a minimal example I am wondering if this is a bug in unified cross section creation or somewhere else.

Running a simple example with just two elements (lithium and lead) one way around and then the other order gives different tally results.

lead before lithium material

<?xml version='1.0' encoding='utf-8'?>
<materials>
  <material id="1" name="PbLi">
    <density units="atom/b-cm" value="0.032720171" />
    <nuclide ao="1.1788" name="Pb204" />
    <nuclide ao="20.2922" name="Pb206" />
    <nuclide ao="18.6082" name="Pb207" />
    <nuclide ao="44.1208" name="Pb208" />
    <nuclide ao="7.9" name="Li6" />
    <nuclide ao="7.9" name="Li7" />
  </material>
</materials>

lithium before lead material

<materials>
  <material id="1" name="PbLi">
    <density units="atom/b-cm" value="0.032720171" />
    <nuclide ao="7.9" name="Li6" />
    <nuclide ao="7.9" name="Li7" />
    <nuclide ao="1.1788" name="Pb204" />
    <nuclide ao="20.2922" name="Pb206" />
    <nuclide ao="18.6082" name="Pb207" />
    <nuclide ao="44.1208" name="Pb208" />
  </material>
</materials>

Runs with lead appear in the materials.xml file first 1.684769278870781 1.6847692788707778 1.6847692788707815 1.6847692788707818 1.6847692788707809 1.6847692788707795 1.6847692788707818 1.6847692788707798 1.6847692788707829 1.6847692788707815

Runs with lithium appearing in the materials.xml file first 1.6851566043500135 1.6851566043500101 1.6851566043500101 1.685156604350011 1.6851566043500106 1.685156604350012 1.6851566043500108 1.6851566043500106 1.6851566043500095 1.6851566043500157

While the repeats of the same materials.xml, geometry.xml files are fairly constant. There appears to be a step change when the order of materials is switched in the material.xml 1.685... and 1.684

Perhaps this is related to the unordered materials discussed here #1813

I attach the openmc simulation files for reproduction and am using version 0.13.0 Git SHA1: f0c1ce8bb138767baaaad8ec19c21a6b835972be on Ubuntu 18.04

lithium_before_lead.zip

lead_before_lithium.zip

gridley commented 3 years ago

This is definitely in the collision nuclide/element sampling code. If you look there, it depends on the order they're defined in.

IMO, we should sort nuclides and elements by ZAID or something.

Also, floating point math is not associative, so just reordering nuclides should change the macroscopic cross section ever so slightly since we loop through in order and add up macroscopic XS components.

shimwell commented 3 years ago

Thanks Gridley.

I agree sorting the nuclides would mean we get the same results. Perhaps that is obscuring the route cause of the problem a bit but it would help me. Sorting by ZAID (perhaps in the Materials.export_to_xml?) sounds like a nice fix which would help get the same result each time.

Would randomizing the sampling order in the collision nuclide/element sampling each sample so that it doesn't depend on the order they are defined in also be an option? Perhaps that would slow the whole simulation down too much

makeclean commented 3 years ago

What you're proposing could either mean rebuilding the total cross section every collision, or every history, or every simulation in inverse order of cost. It should be reproducible, but what does it offer compared to the simplicity of sorting?

shimwell commented 3 years ago

I'm happy with sorting 👍 , just wondering if it is totally correct. It feels like we have a few different total cross sections resulting from floating point number arithmetic and that we are picking a total cross section for a material.

Sorting would mean the same total cross section is picked each time 👍 but I don't see this total cross section is any more correct than the others.

Hence sample from all total cross sections might help, but naturally this is too computationally demanding to recalculate all the time and the difference is very small.

makeclean commented 3 years ago

It's somewhat worrying that roundoff is accruing at such a rate though none the less. If @Shimwell was seeing oscilations in the 14th and 15th SF I would agree with @gridley 100%, but we're seeing 3rd SF differences. @Shimwell you've got previous regarding running small numbers of histories, whats the relative error in these numbers?

gridley commented 3 years ago

It's somewhat worrying that roundoff is accruing at such a rate though none the less. If @Shimwell was seeing oscilations in the 14th and 15th SF I would agree with @gridley 100%, but we're seeing 3rd SF differences. @Shimwell you've got previous regarding running small numbers of histories, whats the relative error in these numbers?

It's because once you sample a single different nuclide to collide with, the particle takes an entirely different track. So simply reordering nuclides seems to often cause the first few histories to match the original simulation, but eventually one will be such that the collision nuclide is different. This propagates through and wholly alters all subsequent histories dependent on that one.

shimwell commented 3 years ago

Redoing the simulations with std dev as requested @makeclean

I can increase the particles if that helps (currently 100 batches of 5000 particles)

lead before lithium (1.684769278870782, 0.0019266051133313604), (1.6847692788707815, 0.001926605113326704), (1.6847692788707826, 0.0019266051133371813), (1.684769278870783, 0.001926605113327868), (1.6847692788707815, 0.0019266051133255398), (1.6847692788707838, 0.001926605113326704), (1.68476927887078, 0.0019266051133336889), (1.6847692788707804, 0.001926605113319719), (1.6847692788707829, 0.0019266051133255398), (1.6847692788707824, 0.0019266051133185547)

lithium before lead [(1.6851566043500101, 0.00181998578506831), (1.6851566043500084, 0.0018199857850744717), (1.68515660435001, 0.001819985785065845), (1.6851566043500101, 0.00181998578506831), (1.6851566043500088, 0.0018199857850732391), (1.6851566043500117, 0.00181998578506831), (1.6851566043500108, 0.001819985785065845), (1.6851566043500077, 0.0018199857850646128), (1.68515660435001, 0.001819985785075704), (1.685156604350009, 0.0018199857850707745)]

Git repo with code in is here https://github.com/Shimwell/mixing_material_different_results_bug

khurrumsaleem commented 3 years ago

Isn't that something already discussed in issue 330?

https://github.com/openmc-dev/openmc/issues/330

shimwell commented 3 years ago

Yep it is the same issue, sorry for missing it. I provided a simple materials.xml file as an example but we came across this by running the same simulation repeatedly with a much larger material file. The material file changed with reruns so I guess the solution mentioned in the original issue of ordereddicts doesn't appear to result in the same materials.xml file each time and the runs come back with different tally results.

makeclean commented 3 years ago

So whilst they are different in value, they are statistically equivalent, I would expect given what @gridley said, as you increase the number of histories the difference will move more the right in SF, e.g. 100x histories should be x10 lower error, should move the SF difference on point to the right.

shimwell commented 3 years ago

yep that sounds right to me and naturally the difference can be removed when particles are increased.

However I'm still going to have to either sort or fully randomize the material order in the materials.xml file. Otherwise parameter studies will not give a smooth surface. Which makes optimization harder than it needs to be.

Would people be open to a PR that sorts the material.xml file by ZAID number?

gridley commented 3 years ago

However I'm still going to have to either sort or fully randomize the material order in the materials.xml file. Otherwise parameter studies will not give a smooth surface. Which makes optimization harder than it needs to be.

I am 99% sure you will not get a smooth surface regardless, unless you are running a problem with no scattering or something like that. Once you slightly perturb a problem, particles are most likely going to take entirely different tracks, not simple perturbations from the original ones.

Would people be open to a PR that sorts the material.xml file by ZAID number?

My personal opinion is that we should sort the nuclide list in C++ after reading. The output of the program should be independent of a physically (but not computationally) identical input. Then nobody would have to worry about calling some python API method to sort the nuclides ever again.

On the other hand, sorting them in C++ would likely require re-golding our regression tests, unless by some strange coincidence the nuclides in the tests are sorted already.

paulromano commented 3 years ago

Agree with @gridley's comment on optimization -- I don't see how you would get a smooth surface. Even if you run the same problem twice on the same machine, if it is done in parallel it's entirely possible you'll end up with two different but statically equivalent answers.

Personally, I am not in favor of doing any sorting by default unless there is a good reason for it. The only "good" reason I can think of is to improve the efficiency of sampling a nuclide for a collision (which would have to be based on some combination of the atom density of that nuclide along with its cross section), but even for that I'd need some clear evidence that it's worth the extra complexity. In the absence of sorting by default, it should be sufficient to provide a method / means by which a user can sort nuclides in materials if they so choose (Materials.sort_nuclides() or something like that).

gridley commented 3 years ago

@paulromano there is a benefit to sorting nuclides for GPU operations. Imagine you have two fuel mixtures, one with nuclides listed in reverse order compared to the other. Now consider you're in the fuel XS lookup kernel, and all even-numbered particles are in fuel mix 1 and the others in fuel mix 2. You are likely to use whatever's in the cache better if there is a chance the whole thread block is looking at the same nuclide. Moreover, with sorted nuclides, you could sync threads after each nuclide. I have heard through the grapevine that a certain GPU MC code that's about 10 times faster than GPU SHIFT uses this technique...

shimwell commented 1 year ago

I just noticed in the openmc deplete module we sort the nuclides according to AtomNumber

https://github.com/openmc-dev/openmc/blob/1cb22075efdb90570f233a3d26151f7b763f8529/openmc/deplete/coupled_operator.py#L422-L425

openmc-dev / openmc

Changing element order in materials changes simulation result #1902