openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org
Other
745 stars 481 forks source link

Can't use 66c cross sections from MCNP #209

Closed paulromano closed 10 years ago

paulromano commented 11 years ago

A user reported that a segmentation fault occurs if you try to run examples/basic using 66c cross sections from the data distributed with MCNP. It appears to be related to delayed neutron energy distributions. Traceback is:

 Loading ACE cross section table: 92235.66c
forrtl: severe (408): fort: (2): Subscript #1 of the array XSS has value 722106 which is greater than the upper bound of 722105

Image              PC                Routine            Line        Source             
openmc             0000000000A9CACE  Unknown               Unknown  Unknown
openmc             0000000000A9B566  Unknown               Unknown  Unknown
openmc             0000000000A52D52  Unknown               Unknown  Unknown
openmc             0000000000A09D2B  Unknown               Unknown  Unknown
openmc             0000000000A0A241  Unknown               Unknown  Unknown
openmc             0000000000454E44  ace_mp_get_real_         1293  ace.F90
openmc             0000000000438AF1  ace_mp_get_energy         893  ace.F90
openmc             000000000041ED45  ace_mp_read_nu_da         544  ace.F90
openmc             000000000041046F  ace_mp_read_ace_t         301  ace.F90
openmc             00000000004058B9  ace_mp_read_xs_            78  ace.F90
openmc             0000000000600C8B  initialize_mp_ini         100  initialize.F90
openmc             000000000075703D  MAIN__                     15  main.F90
openmc             00000000004031FC  Unknown               Unknown  Unknown
libc.so.6          00007F2F3E9A876D  Unknown               Unknown  Unknown
openmc             00000000004030D9  Unknown               Unknown  Unknown
nelsonag commented 10 years ago

I took a quick stab at this, and identified the following:

  1. This bug still exists in the current (as of 15 minutes ago) version of the develop branch
  2. Loading in the U-235 endf66c data in pyne also failed when reading the group 6 delayed neutron energy distribution information
  3. Loading in the U-235 endf66c data in separate ACE-python interface also failed at the same spot.

I haven't been successful devising a problem in MCNP5 or 6 to throw an error when using this data (I probably need to learn about enabling problems which actually use delayed nu data vice total nu.

Given that we have at least two separate codes which also throw errors here (the independent python ACE reader, and OpenMC/Pyne), I'm very much leaning towards an issue in the endf66c data files distributed with MCNP. (I count OpenMC and Pyne as one merely because I believe Paul wrote the ACE readers for both, and therefore I cant technically count them as independent). I think it would be proof enough of bad library data if we can get the same error to show up in MCNP and/or Serpent.

I am unsure what the proper course of action would be if it was a data library bug and not an ace.F90 bug. We can close this and move on, but is this the desired behavior for how we want to fail on bad libraries? I think I am OK with this result, because the right way to do it I guess would be checking the entire ACE file ourselves; something that probably should be the data distributors job and not ours.

makeclean commented 10 years ago

I have a LOT of experience with this kind of problem, I've benchmarked MCNP, Serpent 1 & 2, OpenMC, COG, Tripoli and a few others to boot on the same nuclear data. Both OpenMC and Serpent undoubtedly have better behaviour (from my point of view) regarding the handling of nuclear data errors.

MCNP does very little checking to ensure that the data are valid and correct, indeed I have several cases where Serpent and OpenMC fail to run due to -ve cross sections or -ve probablities in URR, but MCNP happily runs without complaining.

It is undoubtedly the producer/distributor of the data to fix not for you to include exceptions to rules. I think the correct thing to do here would be exit gracefully if possible.

paulromano commented 10 years ago

@nelsonag Were you able to figure out what exactly it is in the data that is erroneous? It would be nice, as @makeclean suggests, to throw a meaningful error message at the point that bad data is reached rather than hit an out-of-bounds error.

Also, what is the (non-PyNE) Python ACE reader you referred to? To be clear, I did write the ACE module of PyNE so they absolutely are not independent. In fact, that module pre-dates OpenMC, so when I was writing OpenMC, ace.f90 was basically a Fortran port of pyne.ace.

nelsonag commented 10 years ago

I have been actually, I think I found the culprit and it turns out this one is a little bit messier than I was hoping for. So the ACE format guide (MCNP5 manual, Vol III, Appendix F) makes use of an L(I) parameter to describe the location of each incoming energy's distribution data relative to JXS(11) or JXS(19). This is used in Laws 4, 44, 61, and 67. OpenMC, PyNE and that other script I mentioned do not use the L(i) information, and instead make the implicit assumption that the law data for the next incident energy set comes immediately after the previous incident energy data set. This is a perfectly fine methodology because the ACE Format Guide itself does NOT use L(i) for anything in the data. This can be seen in Table F14.b; the line which describes where to find the data for E(1) says to go to entry K=3+2_NR+2_NE, and is not a function of L, which you think it should be. I don't have access to any older versions of the ACE Format Guide so I can't say if it was always like this or not.

Now, the ACE data release in question (at least for U-235 and U-238 in that evaluation, ENDF-B/VI.5 I think) actually DOES make use of the L(i) parameter, and uses it in a way that is inconsistent with the current format guide definition of where to find the E(1) data (K=3+2_NR+2_NE). That is, for at least these two cases L(1) = L(2). These datasets intend for the same probability distributions to be used for incoming energy 1 as are used for incoming energy 2.

OK, how to fix this? I can think of two options: 1) Put in code to say "Hey, if L(I) == L(J), quit gracefully." This is a trivial fix, allows for modern data, and is true to the format guide.
2) Rewrite the ACE reading routines and energy distribution sampling routines to use L. This requires much more of a time investment and is not true to the format guide (and thus could create more errors), nor is it consistent with LANL's own methodology used in the non-PyNE python ACE reader I mentioned.

I vote for option 1, but this is based on my assumption that the MCNP5 Users Manual is currently the official format guide for ACE. Does anyone know if that is strictly true? I have not found any information to suggest otherwise at this point.

By the way, the non-PyNE ACE reader mentioned is one given to me by LANL a few months ago which they dont consider to have gone through enough V&V to be released.

nelsonag commented 10 years ago

I should add that we do need to have an error message of some form in place for this; you only get the out-of-bounds error in certain scenarios, but not in all, even though this L(i) issue is going on. For example, OpenMC does not have an error when substituting U-238.66c for the U-235.66c, but both have L(1) == L(2) for the precursor data.

paulromano commented 10 years ago

Ahh, ok... I remember seeing that L(I) parameter and thinking it seemed kind of superfluous. This helps to explain why it's there. Like you, I would vote for option 1 as an acceptable resolution to this. Do you want to take a shot at putting in that check @nelsonag?

As far as I know, the MCNP documentation is the official ACE format guide. However, there have been a number of changes to the format recently that are not reflected in documentation anywhere yet, e.g. continuous S(a,b) data, polynomial expansion coefficients for on-the-fly broadening.

nelsonag commented 10 years ago

I'd be glad to. FYI, I checked the serpent source code; they use L(I) as a location for E(I) data. That being said, I still think we should go with option 1 for the reasons discussed above.

nelsonag commented 10 years ago

The pull request has been submitted. I wanted to add here that I ran every library in the MCNP6 default XSDIR file (after converted to a cross_sections.xml file of course). The deficiency showed up for pretty much every fissionable nuclide in 61c, 64c, 65c, 66c, 97c, 68c, and 69c. There is probably some common NJOY option and/or version used to generate these files. I did not check to see if the issue exists in non-delayed-neutron energy distributions or if they were all in delayed-neutrons.

paulromano commented 10 years ago

If the issue is really that widespread, perhaps we should get in touch with someone from LANL who is involved with the ACE format. There is a possibility that the documentation itself is incorrect and that using those pointers really is the intended behavior.

paulromano commented 10 years ago

Do we know if using the L(i) pointers would actually break any of the newer data? (My guess is no, if that's what MCNP and Serpent do)

nelsonag commented 10 years ago

Don't know for sure; probably not. You are right, it is probably best to consult with LANL regarding if this data is deprecated or not. I will send an email message tomorrow. On Dec 9, 2013 8:15 PM, "Paul Romano" notifications@github.com wrote:

Do we know if using the L(i) pointers would actually break any of the newer data? (My guess is no, if that's what MCNP and Serpent do)

— Reply to this email directly or view it on GitHubhttps://github.com/mit-crpg/openmc/issues/209#issuecomment-30191241 .

nelsonag commented 10 years ago

Just an update; I have sent the email and have been waiting for some resolution. I probably will try a few other contacts over the next few days

nelsonag commented 10 years ago

just pinged again. Now that I am nearing the end of my 'paternity-leave', I will focus more on getting to the bottom of this.

nelsonag commented 10 years ago

This issue should be fixed by PR #278. Note that I still have not been able to get any official word from LANL. I just decided to go ahead with the change since it has no practical impact on the current data.

paulromano commented 10 years ago

Closing since #278 was merged