Implement HDF5-Based MGXS Library Format

nelsonag commented 8 years ago

As discussed in #336, it would be nice to have an HDF5-based mgxs input file format. This can be done in one of two ways:

1. We just take the current xml format as is and port it to HDF5.
2. Gain some experience with different use cases of the current capability and see if format changes exist which would be more useful.

I'm open to either and am curious to hear others' opinions. One facet to consider is whether this should be included with 0.8 or not. If so, that would push us to #1.

wbinventor commented 8 years ago

It would be nice to develop an HDF5 format consistent with OpenMOC, and perhaps MOCingjay (if @friedmud chooses to use HDF5). This doesn't mean that the format needs to be consistent with OpenMOC's currently accepted HDF5 format, but we would likely want to update our format there in lockstep with a format implemented here so the same files work with each code. Understandably our format in OpenMOC would probably be a subset of that for OpenMC, since the multi-group implementation here includes support for higher order scattering and angular-dependent MGXS. Anyway, I'm interested in participating in a conversation on this as you consider options @nelsonag.

nelsonag commented 8 years ago

That is a very good place to start, no reason to be different if we don't have to be. What is MOCingjay anyways?

I should also add that while I don't know if it makes sense to attach it to OpenMC directly, it would be awfully nice to have a generic MGXS converter from external formats (HELIOS, WIMS, ANISN, DDX, whatever other perturbations are common) to our own to help out with the researchers. While everyone should use MC to generate their MGXS, not all may want to :-P

paulromano commented 8 years ago

The latter issue sounds like something more suited for pyne.

nelsonag commented 8 years ago

Good point.

friedmud commented 8 years ago

MOCingjay is a new MOC code I'm working on. I'm not at a point yet where I want to say too much about it :-). Hopefully soon!

@wbinventor I would definitely participate in trying to hammer out a consistent HDF5 format. I do think that's what I'm going to use for MOCingjay. As always though: trying to satisfy the needs of many users can lead to a huge, nasty, specification. So we do need to weigh the pros and cons.

Have you seen the design documents for YakXS? That's the XML format Yaqi invented for Rattlesnake. It is huge, and exhaustive. Could be a good place to start. I think it's been openly published... I'll try to find a copy of it. Of course, we wouldn't use XML... but it's still a modern hierarchical XS storage plan.

nelsonag commented 8 years ago

I've never heard of YakXS and a scholar search came up with nothing obvious. Interested to see any references you have though!

nelsonag commented 8 years ago

@wbinventor and @friedmud do you guys have any plans to support in-line self-shielding calculations? That is, do you expect your mgxs library format to include background cross sections, or subgroup data?

I ask because if not, then the current OpenMC MGXS data is very nearly a superset of all your needs. From the OpenMOC documentation it looks like you also would have a diffusion coefficient and buckling as user input. Those can be easily added, but is buckling a scalar or vector of length G?

friedmud commented 8 years ago

My current plan is to ignore self-shielding altogether... just do everything by generating XS directly using OpenMC. We'll see if I get tired of waiting ;-)

My main issue with the current format is that I need libraries that have multiple material temperatures and densities in them, i.e. I need to generate XS at 5 different temperatures and 5 different moderator densities and store all of that data in a way that lets me quickly interpolate between the tabulated points.
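
A minimal sketch of the lookup described above, assuming one cross section value tabulated on a rectangular grid of temperatures and moderator densities. The names `bracket` and `interpolate_xs`, the grid points and the cross section numbers are all hypothetical and not part of any existing OpenMC or YakXS format. It uses plain bilinear interpolation; a real code may well prefer another scheme.

```python
from bisect import bisect_right


def bracket(grid, x):
    """Return i such that grid[i] <= x <= grid[i + 1], for x inside the grid."""
    if not grid[0] <= x <= grid[-1]:
        raise ValueError(f"{x} is outside the tabulated range")
    i = bisect_right(grid, x) - 1
    return min(i, len(grid) - 2)


def interpolate_xs(temps, dens, table, t, rho):
    """Bilinearly interpolate table[i][j], tabulated at (temps[i], dens[j])."""
    i = bracket(temps, t)
    j = bracket(dens, rho)
    ft = (t - temps[i]) / (temps[i + 1] - temps[i])
    fr = (rho - dens[j]) / (dens[j + 1] - dens[j])
    low = (1 - fr) * table[i][j] + fr * table[i][j + 1]
    high = (1 - fr) * table[i + 1][j] + fr * table[i + 1][j + 1]
    return (1 - ft) * low + ft * high


# Hypothetical grid: 3 temperatures (K) x 2 moderator densities (g/cm^3).
temps = [300.0, 600.0, 900.0]
dens = [0.6, 0.8]
# Made-up total cross sections for a single energy group (1/cm).
sigma_t = [[0.50, 0.60],
           [0.52, 0.62],
           [0.54, 0.64]]

value = interpolate_xs(temps, dens, sigma_t, 450.0, 0.7)
print(value)
```

With 5 temperatures and 5 densities the same function applies unchanged; only the grids and the table grow.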

The YakXS format has that capability in it... I am realizing now that I still haven't posted any YakXS literature. Let me see if I can get something that has already been cleared for external release...

wbinventor commented 8 years ago

@nelsonag we have no plans to use OpenMOC as a traditional lattice physics code for in-line self-shielding, but rather as a high-fidelity full-core 3D analysis tool. @PrezNattyGibbs did work on some stuff along these lines early in his PhD, but determined it best to do it separately in his own code and it didn't make it into OpenMOC, nor do we have anyone lined up to continue this area of work.

As for the OpenMOC MGXS data, be warned that our current documentation online is for v0.1.4, released nearly a year ago, and is very out-of-date. You should build our documentation locally, which includes an up-to-date guide on MGXS (though it is still lacking a lot of details on bells and whistles in the code). Sorry for the confusion; our core code maintainer @geogunow should be releasing v0.2 and the respective documentation soon.

In short, you are correct in that data generated with openmc.mgxs fully meets the needs of OpenMOC. We no longer need bucklings or diffusion coefficients as input as these are computed on-the-fly for CMFD. The only things we need are constants for total/transport, nu-fission, (nu-)scattering and chi.

I do think that openmc.mgxs could be relatively easily extended to deal with both temperature (of interest to @friedmud) and angle (of interest to @nelsonag). This would account for 1) input generation, 2) (multi-)statepoint data processing, and 3) interpolation methods. If anyone wants to take a stab at implementing this I'd be happy to share my thoughts as a starting point.

wbinventor commented 8 years ago

Btw @nelsonag the process to build the OpenMOC docs locally is identical to that for OpenMC - simply run make html inside of our "docs" directory (with sphinx installed).

wbinventor commented 8 years ago

Also I'll add that density could be included in an extension to openmc.mgxs as well. I see temperature and densities as being additions to openmc.mgxs.Library while angle would be an addition to openmc.mgxs.MGXS.

nelsonag commented 8 years ago

@wbinventor & @friedmud I started a gist to capture the format. Basically it's the mgxs_library.rst file from our current docs directory, messed around a bit to represent what the h5 format would be. I based it off of the current mgxs.xml file, obviously, but also the OpenMOC documentation on the develop branch. By that I mean I use your naming of mgxs types where applicable.

When you get a chance can you take a look and let me know if it meets your needs?

nelsonag commented 8 years ago

@friedmud I'm clearly still cogitating on your material temperature and density concern. For now I did not include that in the format itself because the xs name gives all the flexibility you would need to store the multiple temps/densities. One thing I'm struggling with is what the density means to the user if they're using this library for microscopic data. I suppose a micro/macro attribute could be added, and if micro, then no density dimension is needed on the sub-xsdata objects, but that sounds a bit messy.

wbinventor commented 8 years ago

Thanks for sharing your Gist @nelsonag, it is looking very nice! Although I don't have a near-term use case which calls for temperature- or density-dependence, I do have some thoughts on it which I thought I'd record here. As for density-dependence, I presume @friedmud was referring to functional form of pin cell MGXS on moderator density for coupled multi-physics simulation?

Just as the MGXS class serves as a "meta tally" leveraging tally arithmetic, we may consider implementing new class(es) which serve as "meta MGXS" to implement various functional approximations (e.g., point-wise linear) between MGXS at different temperatures and densities. These "meta MGXS" could also use tally arithmetic to implement their various functional approximations - for example, point-wise linear dependence would be very simple to implement to combine the MGXS.xs_tally objects from two or more MGXS tallied at different temperatures or densities. The "meta MGXS" classes could implement API(s) to output the predicted cross section values at different temperatures / densities, output the data at various temperatures / densities to HDF5, etc. This might not be what @friedmud would need, but just thought I'd throw it out there as a starting point.

nelsonag commented 8 years ago

Thanks @wbinventor! Right now I'm thinking the interpolation shouldn't be up to the MGXS class; instead, it should be up to the code using the data to do it internally. In my experience, codes that want to vary temperature, density, burn-up, etc. will do it internally.

friedmud commented 8 years ago

Ok - I finally got the go ahead to put out the YakXS format document. I'm attaching it here. Note that it's huge... and we probably don't want / need most of these features but it might give us some ideas...

yakxs.pdf

friedmud commented 8 years ago

Among the things I like about YakXS is the ability to specify sparse scattering matrices... that gets pretty important as group sizes get large.

nelsonag commented 8 years ago

@friedmud Do you like that feature due to the pain of writing all the zeros or the storage? I ask because if it's the former, then the user can always generate their hdf5 library with compression turned on. I think I prefer that route since I get the best of both worlds: I am able to see the complete matrix after it's input into the library, so I can do a real quick check on whether it's been entered correctly, and I get the small storage footprint.

friedmud commented 8 years ago

Also because it can reduce computation.... no need to multiply by those zeros either!

But yeah: memory is the big thing. If you're going to have separate XS for each material in each pin in a full reactor with 128 or 256 energy groups... that's a LOT of zeros...
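
To put a rough number on that, here is a back-of-the-envelope estimate of the dense storage. It assumes 8-byte floats and one G x G matrix per scattering order; the function name and the material count are hypothetical placeholders, not figures from this thread.

```python
def dense_scatter_bytes(groups, orders=1, bytes_per_value=8):
    """Bytes needed for `orders` dense (groups x groups) scattering matrices."""
    return groups * groups * orders * bytes_per_value


for groups in (128, 256):
    per_material = dense_scatter_bytes(groups)
    print(f"{groups} groups: {per_material // 1024} KiB per material")

# Hypothetical example: 50,000 distinct materials at 256 groups.
n_materials = 50_000
total_gib = n_materials * dense_scatter_bytes(256) / 1024**3
print(f"{n_materials} materials: {total_gib:.1f} GiB")
```

The cost grows with the square of the group count, so going from 128 to 256 groups quadruples the dense footprint per material.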

I would be interested in seeing how effective HDF5 compression is for scattering matrices though....

nelsonag commented 8 years ago

Totally agree on the computation; I didn't think of that, though, as OpenMC's MG mode automatically takes the dense scattering matrix and converts it to sparse.

My gut tells me it's easy for a compression algorithm to compress a bunch of 0s. When I get this thing implemented I will include an option in the Python API to choose some compression options. We shall see.
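
One way to get a feel for this without an HDF5 file is to deflate a mostly-zero matrix with Python's standard zlib module. This is only a stand-in: HDF5's gzip filter is also deflate-based, but it compresses each dataset chunk separately and adds its own overhead, so the ratio printed here is not what an actual library file would show. The matrix is synthetic (within-group scatter plus a few downscatter entries per row, no upscatter) and the helper names are made up for illustration.

```python
import random
import struct
import zlib


def synthetic_scatter_matrix(groups, band=3, seed=0):
    """Dense matrix with nonzeros only on the diagonal and `band` entries after it."""
    rng = random.Random(seed)
    matrix = [[0.0] * groups for _ in range(groups)]
    for g_in in range(groups):
        for g_out in range(g_in, min(g_in + band + 1, groups)):
            matrix[g_in][g_out] = rng.random()
    return matrix


def deflate_ratio(matrix, level=9):
    """Return compressed size / raw size with the matrix stored as 8-byte floats."""
    flat = [value for row in matrix for value in row]
    raw = struct.pack(f"<{len(flat)}d", *flat)
    return len(zlib.compress(raw, level)) / len(raw)


ratio = deflate_ratio(synthetic_scatter_matrix(128))
print(f"compressed to {ratio:.1%} of the dense size")
```

The nonzero entries are random doubles and barely compress, so the result is dominated by how cheaply the runs of zeros are encoded.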

nelsonag commented 8 years ago

@friedmud: I just ran our MGXS example notebook #4 problem (a single 17x17 assembly with one MGXS set per material and p3 scattering), and running gzip compression at level 9 on the hdf5 MGXS library resulted in a 63% decrease in size. That's not great but really not bad at all.

This compression performance should also improve as the library size grows: I noticed a very small 2-grp library had an increase in size of ~100kb, I guess just due to the compression table overhead (every dataset gets compressed independently...). This 63% number is from a library that, when compressed, is only 104kb, and therefore probably dominated by that overhead.

nelsonag commented 8 years ago

To be clear when I say I ran 'gzip compression at level 9' I mean that I told the HDF5 library itself to 'compress yourself, please'. I do not mean that I ran gzip on the file from the command line.

friedmud commented 8 years ago

@nelsonag interesting... but when is the decompression done? Is it done as the file is read? Because all of those zeros will still take up a lot of memory...

Eventually I really do want some sort of sparse storage for my XS... both for memory and computation.

I like the "format" that the YakXS guys came up with... it incorporates what we know about the physics: that there is a limited ability for upscatter and there is (possibly) a limit for downscatter. So, in their format you essentially specify the first column and last column of each row that will have values... and you provide a value for everything in between (keep in mind that YakXS uses "out on row" for its scattering matrices). It's a very intuitive way to do it that maps well to the physics.
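
The row-wise scheme described above can be sketched as follows. This illustrates the idea only and is not the actual YakXS layout: the names `to_banded` and `scatter_source` are hypothetical, indices are zero-based, and rows are taken to be outgoing groups ("out on row") as mentioned in the comment.

```python
def to_banded(dense):
    """Store each row as (first_col, values) spanning first to last nonzero."""
    rows = []
    for row in dense:
        nonzero = [j for j, value in enumerate(row) if value != 0.0]
        if not nonzero:
            rows.append((0, []))  # empty row: nothing scatters into this group
            continue
        first, last = nonzero[0], nonzero[-1]
        rows.append((first, row[first:last + 1]))
    return rows


def scatter_source(banded, flux):
    """source[g_out] = sum over g_in of sigma_s[g_out][g_in] * flux[g_in]."""
    source = []
    for first, values in banded:
        source.append(sum(v * flux[first + k] for k, v in enumerate(values)))
    return source


# Made-up 4-group matrix: rows = outgoing group, columns = incoming group.
# The only upscatter is from group 3 into group 2.
dense = [
    [0.5, 0.0, 0.0, 0.0],
    [0.2, 0.4, 0.0, 0.0],
    [0.1, 0.3, 0.6, 0.1],
    [0.0, 0.1, 0.2, 0.7],
]
flux = [1.0, 2.0, 3.0, 4.0]

banded = to_banded(dense)
stored = sum(len(values) for _, values in banded)
print(f"stored {stored} of {len(dense) ** 2} entries")
print(scatter_source(banded, flux))
```

Only the stored entries are multiplied, which covers both concerns raised in the thread: less memory, and no work spent on the zeros outside each row's span.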

Implement HDF5-Based MGXS Library Format #594