pyne / pyne

PyNE: The Nuclear Engineering Toolkit
http://pyne.io/
Other
270 stars 177 forks source link

Read/Write MCNP, ALARA Materials #78

Closed elliottbiondo closed 11 years ago

elliottbiondo commented 11 years ago

I want to add the capability of reading/writing MCNP and ALARA materials. Since I am still new to PyNE I am making my intentions public first to make sure my planned method is pythonic/PyNE-thonic (for a surprize-free pull request).

Reading in MCNP materials from MCNP input files:

I will create a function read_mcnp_materials within the material.pyx module. This function will read in an MCNP input file and output a dictionary of materials. The dictionary keys will be in the form mX_Y, where X is MCNP material number and Y is the density (see note below). The dictionary values will be pyne.Material objects. In order to create these objects, the function will search for material cards and their respective densities in the cell cards. The MCNP material name (e.g. m1, m2, m3) will be the first entry in mX_Y.attrs. The next entries in mX_Y.attrs will be all of the comments that appear above the material. If an MCNP card specifies materials by atom fraction, read_mcnp_material will use the from_atom_frac method of the material class to convert accordingly. If atom densities are specified the mass density will be calculated.

Example of read_mcnp_material functionality on the following input file entry:

c some custom hydrogen composition
c from source Z
m3
     1001 0.75
     1002 0.25

(with a density of -1.0 specified in some cell card)

The read_mcnpmaterials function will create a dictionary key "m3-1.0" and a dictionary value that looks like:

Material(nucvec={1001:0.75, 1002:0.25}, density=1.0, attrs=['m3', 'some custom hydrogen composition', 'from source Z'])

Writing out materials:

Writing functions will be methods of the material class. For starters I will create write_mcnp and write_alara methods. These methods will have one argument, material name, which is optional. If the material name is not given, these methods will use the first entry in attrs. The methods will print out all subsequent attributes as comments. This means that if one calls read_mcnp_material and then write_mcnp, the input and output will be exactly the same.

Simple example work flows

To demonstrate the usefulness of this, here are a few sample work flows. Consider a MCNP input file (named "mcnp.inp" with material definitions m1, m2, m3 and densities 1.0, 2.0, and 3.0 respectively.

Example 1: Now let's say you just want to print out all the materials definitions in ALARA format.

from pyne.material import *

mats = read_mcnp_materials( "mcnp.inp")

for name, mat in mats.items() :
    mat.write_alara()

Example 2: Let's say you want to make mixtures and then print out new MCNP definitions:

from pyne.material import *

mats = read_mcnp_materials( "mcnp.inp")

#0.25 and 0.75 are mass fractions
mixt1 = MultiMaterial( {mats('m1_-1.0') : 0.25, mats('m2_-2.0') : 0.75} ).mix_by_volume()

#0.4 and 0.5 are volume fractions
mix2 = MultiMaterial( {mats('m3_-3.0') : 0.4, mats('m2_-2.0') : 0.6 ).mix_by_mass()

print mix1, mix2

Quick note on densities

I decided it would be best for read methods to support MCNP materials with multiple densities, so the dictionary keys will be formated like mX_Y. This slightly burdens the user in the case of the Example 2 work flow (it would be easier if the dictionary keys were simply mX). However I believe that the benefits of supporting more kinds of MCNP input files outweigh the costs.

If you have read this far, thanks! Please let me know if anything about my plan is not satisfactory; I am happy to hear any advice.

erelson commented 11 years ago

Hopefully I don't hijack the major discussion here too much by asking this: what approach do we/you have in mind for how cross section libraries are specified, e.g. 8016.70c 1.0? At the moment, it seems to me that we can probably get away with throwing out this information when importing an MCNP input, and let created MCNP inputs use whatever default libraries MCNP prefers. (then there's the even thornier issue of what to do with S-alpha-beta information, aka MT cards)

These are just a couple things I haven't noticed come up at all in past discussions, so I wanted to throw them out there before I forget about them again.

chrisdembia commented 11 years ago

Awesome!

@elliottbiondo : I wrote the simplesim module of opensim this summer, which is a way to generate input files for codes, right now only for MCNP, to create an OO and code-independent persistent representation of a simulation. The idea this summer was that you could also use it to read in an input file and convert it to an input file for a different code (e.g. Serpent). What you want to do would be a piece of this, but I'd say at this point you probably want to get something working and attaching yourself to the simplesim module would require a lot of thought. There is, in simplesim/cards.py, a Material card object, and so it's possible you could create a Material card, not just a dict of Material's.

Anyway, these are ideas, not instructions.

@erelson : The way I handled this in simplesim was completely general: the user specifies the exact table identifier (e.g. ".17t"). The reason I did this was that different codes use a different syntax for this table identifier, and it seemed it would be too difficult (and I didn't have the time) to manage a conversion between these identifiers for different codes.

scopatz commented 11 years ago

Hi @elliottbiondo, thanks for writing this! I think that this will make the PR process go a lot easier. I am going to respond in some depth, so I apologize if this is similarly long.

This function will read in an MCNP input file and output a dictionary of materials. The dictionary keys will be in the form mX_Y, where X is MCNP material number and Y is the density (see note below). The dictionary values will be pyne.Material objects.

My feeling is that this function should return a list of Materials so that the original order is maintained. Also, the keys are not really necessary since all of the key data should be given in the attrs.

Also as a side note when designing key conventions, if you have multiple pieces of data it is better to separate them out into a tuple rather than paste them together in string form. For example I would recommend ('mX', Y) in (str, float) form rather than 'mX_Y in str form. Furthermore, while this will work, it is generally not advisable to use floats in keys because differential changes will not have the same hash value. This is another reason why I think a list is a better idea here than a dict.

The read_mcnpmaterials function will create a dictionary key "m3-1.0" and a dictionary value that looks like:

Material(nucvec={1001:0.75, 1002:0.25}, density=1.0, attrs=['m3', 'some custom hydrogen composition', 'from source Z'])

Furthermore, the attrs really should be a dictionary (even though lists are supported). I would instead suggest a parser that produced something like

Material(nucvec={1001:0.75, 1002:0.25}, density=1.0, attrs={'name': 'm3', 'comments': ['some custom hydrogen composition', 'from source Z'], 'dens_type': 'mass'})

The major differences here are that the metadata gets stored in a more complex (but more robust way). The comment cards get read in as a list. The material tag 'm3' gets assigned to 'name'. And finally, the 'dens_type' gets added as either 'mass' or 'atom' depending on whether the density is positive or negative. 'dens_type' will probably only ever be used when writing back out to MCNP form. However, this format for reading in should allow you to exactly recreate the original MCNP text that you read in.

Additionally, you could make a wrapper method on the multimaterial class that uses the 'name' attr as the keys.

For starters I will create write_mcnp and write_alara methods. These methods will have one argument, material name, which is optional. If the material name is not given, these methods will use the first entry in attrs. The methods will print out all subsequent attributes as comments. This means that if one calls read_mcnp_material and then write_mcnp, the input and output will be exactly the same.

I agree that these should be methods on the Materials class. However, the first non-self argument should be the file name that you want to write out to, like the other write methods have. More generally, I like the idea that all other **kwargs could just get applied to the attrs. All of the other changes flow nicely from having more sophisticated attrs.

Finally, I think that the workflow that you want to set up is the right one ;) I hope that these suggestions make sense.

scopatz commented 11 years ago

@erelson

Hopefully I don't hijack the major discussion here too much by asking this: what approach do we/you have in mind for how cross section libraries are specified, e.g. 8016.70c 1.0? At the moment, it seems to me that we can probably get away with throwing out

Or barring throwing this out, it should go into the attrs.

scopatz commented 11 years ago

I wrote the simplesim module of opensim this summer, which is a way to generate input files for codes, right now only for MCNP

@elliottbiondo @fitze That is right, you may want to steal, copy, move the simple simplesim code for handling MCNP materials.

paulromano commented 11 years ago

Just one very minor thing to keep in mind, if you plan on supporting S(a,b) tables. It is possible that a mt# card contains multiple S(a,b) tables. In this case, the code will try to match the ZAID of the S(a,b) table (specified in the ACE data itself) with the ZAID of a nuclide in the material. Some examples of this would be U and O in UO2 or Be and O in BeO.

elliottbiondo commented 11 years ago

Thank you all very much for the input. Here is my modified plan for going about doing this.

General changes to structure

I will go with the changes that Anthony suggested, among them: having read_mcnp_materials return a list and having attrs be a dictionary. The only thing I do not like about this are the ramifications to the work flow "example 2" from my original post. If read_mcnp_materials returns a list, then mixtures must be made by specifying materials in terms of the list index (whereas in my method materials could be specified by their slightly more user friendly dictionary key). The user will have to be sure they know what list index corresponds to what materials, which may be burdensome if multi-density materials are present. I do not think that "example 2" will be a popular work flow anyhow (I will not get into the alternatives now).

Incorporation with simplesim

It looks like the simplesim materials card class has a material class attribute. I think for now I will just write code using material class because I do not think this next level of abstraction is necessary for the code I intend to write. From what I have read in the simplesim documentation, it doesn't look like there is a "front end" that reads input files yet. The read_mcnp_materials function should be able to serve as a front end with little modification. I think we should cross this bridge when we come to it though.

Handling of mt cards

The mt card information could easily be read and stored as entries in the attrs dictionaries. This would probably just involve a few more parsing functions. I am not particularly familiar with mt cards so I think I will address this at a later time.

Cross section specification

If cross sections are specified for materials, I will read them in and treat them in a similar way to simplesim ("tables" attribute): one attrs dictionary entry will be a dictionary of isotopes and their respective cross section identifiers.

Thanks again

I will not close this issue until the simplesim and mt card considerations have been revisited.

scopatz commented 11 years ago

I will go with the changes that Anthony suggested, among them: having read_mcnp_materials return a list and having attrs be a dictionary. The only thing I do not like about this are the ramifications to the work flow "example 2" from my original post. If read_mcnp_materials returns a list, then mixtures must be made by specifying materials in terms of the list index (whereas in my method materials could be specified by their slightly more user friendly dictionary key). The user will have to be sure they know what list index corresponds to what materials, which may be burdensome if multi-density materials are present. I do not think that "example 2" will be a popular work flow anyhow (I will not get into the alternatives now).

To be clear, I think that this function should be used under the covers by a MultiMaterial method. Thus if the user wanted the key-based access they would use MultiMaterial. However I think that the keys here should just be the name attr (and not name + density). For example, here is what the MultiMaterial method would looks like:

class MultiMaterial(collections.MutableMapping):
    ...

    @classmethod
    def read_mcnp(self, fname):
         for mat in read_mcnp_materials(fname):
             self[mat.attrs['name']] = mat

    ...

Note that this should probably be a class method...

Incorporation with simplesim

Ultimately, I think that the simplesim stuff should use what is on the Material class and not the other way around. However, nothing with simplesim should be a barrier to closing this issue ;).

chrisdembia commented 11 years ago

I agree.

elliottbiondo commented 11 years ago

Thus if the user wanted the key-based access they would use MultiMaterial.

Ah, I see what you mean. I agree that that is definitely the best way of doing it. Thanks!

gonuke commented 11 years ago

Some additional comments on the material name + density tuple. Here in Wisc, we are often faced with the reuse of material compositions but with different densities. That said, I'm not sure we need to have an internal representation with unique entries for each density. We should think about an internal representation that allows for compositions to be reused with different densities. Each material might have a nominal density, but this could be overridden in some fashion in a different data structure that points to the composition that may be shared with other materials.

elliottbiondo commented 11 years ago

Each material might have a nominal density, but this could be overridden in some fashion in a different data structure that points to the composition that may be shared with other materials.

I am not sure how different of a data structure you are referring to, but I think I have come up with a solution that maintains a high fidelity to the current Material class. Here it goes:

*\ Note: this is all in addition to everything discussed before:

The density attribute of the Material class should be a dictionary. This allows users to have any arbitrary number of densities assigned to the same material, with whatever key names they choose. For example, let's say you have densities for a material at different temperatures:

my_material.density = {'300K' : 1.0, '500K' : 1.1, '1000K' : 1.15} 

Of course, this would mean that the user would have to provide these names. For the read_mcnp_materials function, if an MCNP material has multiple densities, the densities will just be named 1, 2, 3, etc. as default (see final note at bottom).

If materials have multiple densities, the user will have to specify which density they need for mixing by volume, which means that mix_by_volume method of the MultiMaterial class needs a density argument for each constituent. I can change the mix_by_volume method of MultiMaterial class so that every material constituent has not only mass/volume fraction, but also an associated density name, in this format:

mix1 = MultiMaterial( { material1 : [0.75, '1000K'],
                        material2 : [0.15, '1000K'],
                        material3 : [0.10, '1000K'] } )

If a material only has one density, then the dictionary value can just be a float and not a tuple. The mix_by_volume method will handle this too:

mix1 = MultiMaterial( { material1 : 0.75,
                        material2 : 0.15,
                        material3 : 0.10, } )

A final note:

I think it would be good for PyNE code to be amenable to work flows with some notion of a materials library; a standalone file containing material definitions that can easily be read into PyNE and manipulated according (mix materials, print definitions in different formats, etc.). It looks like the from_text and write text methods can be used for exactly this, but I may need to expand them to allow for multiple densities and all of the other information that may be added to attrs.

In the case when read_mcnp_materials is used to read in materials of different densities, the materials could be easily written out to text and the density names (which are assigned default values of "1," "2," "3," etc.) can be changed manually before reading the library back in, as desired.

scopatz commented 11 years ago

@elliottbiondo Sorry about the tardy reply! I blame my poor internet.

Each material might have a nominal density, but this could be overridden in some fashion in a different data structure that points to the composition that may be shared with other materials.

I am not sure how different of a data structure you are referring to, but I think I have come up with a solution that maintains a high fidelity to the current Material class. Here it goes:

I think that this just is the MultiMaterial.

The density attribute of the Material class should be a dictionary.

I disagree. I think that this adds a lot of overhead and would slow down the material class. Materials are supposed to be quasi-immutable (like numpy arrays). That is to say that when a modification is too big, a separate object should be returned whose values are mostly copied from the original.

All hope is not lost! I think that the right tool to accomplish multiple densities is the MultiMaterial. Just have keys which have the appropriate temperature values and values which are materials with the corresponding densities. Now if you want to accomplish the sorts of things that you are talking about with mixing by volume, etc, you can extend the MutliMaterial class to not only be Material-valued but also MultiMaterial-valued! By allowing this nesting you will be able to do everything that you need without making the very deep changes to the Material class that you are suggested.

As a development strategy, I wouldn't worry about implementing this nesting first thing. I would simply do what you proposed first and then refactor later. Yes, this might be slightly more work, but this process really helps to clarify how these different class interact.

I think it would be good for PyNE code to be amenable to work flows with some notion of a materials library; a standalone file containing material definitions that can easily be read into PyNE and manipulated according (mix materials, print definitions in different formats, etc.). It looks like the from_text and write text methods can be used for exactly this, but I may need to expand them to allow for multiple densities and all of the other information that may be added to attrs.

Thanks to @jdangerx we now have a fledgling materials library in nuc_data_make. You can grab data out of this relatively easily with the from_hdf5() method. What we don't have (yet!) is a query engine on top of this (which would be a fun project for someone). Still, I suspect that given what I have said above, that the MultiMaterial class will have its own read and write methods which wrap multiple calls to the underlying Material methods, perhaps adding some custom metadata along the way(?).

In any event, I hope this all makes sense and I am soooooooper sorry about my slow response time.

elliottbiondo commented 11 years ago

@scopatz no worries, I have been busy myself.

I think that the right tool to accomplish multiple densities is the MultiMaterial. Just have keys which have the appropriate temperature values and values which are materials with the corresponding densities.

I think we have different ideas on what MultiMaterial should do. The way it is currently implimented is:

mixture_1 = MultiMaterial( {mat1: 0.2, mat2:0.3, mat3: 0.5})

If I am not mistaken, you are suggesting that MultiMaterial should also support:

multi_density_material_1 = MultiMaterial( {'300K' : mat1_rho1, '400K' : mat1_rho2})

Although both of these example objects are collections of materials, to me it seems they are too different to be of the same class. Calling methods designed for "mixture_1" would not work for "multi_density_material_1" and vice-versa. Perhaps the current MultIMaterial class should be renamed "Mixture," and new MultiMaterial class could be created for things like multiple density materials, different material definitions, entire input cards, etc?

I think I will start implementing and PRIng some of the less controversial features we have discussed, as there is a growing need for them.

scopatz commented 11 years ago

Oh Whoops! My bad... Then to support multiple temps you would do something like:

mixture_1 = MultiMaterial( {mat1: None, mat2: None, mat3: None})
mat1.attrs['temp'] = 300.0
mat2.attrs['temp'] = 400.0

The actual weight values don't matter so I set them to None, though you could set them to 1, or -1.0, np.nan or any other number. (At the danger of thinking outload, maybe multimaterials could optionally also be initialized with sets so you don't have to put in all of those Nones. Being a set would signal that only a single mat should be selected, while being a dict would signal that it is a mixture.)

I think I will start implementing and PRIng some of the less controversial features we have discussed, as there is a growing need for them.

I think that is a great idea. You have more than enough to get started ;)

elliottbiondo commented 11 years ago

It looks like a lot of methods like write_text, from_text, write_hdf5 are written in C++ with Cython wrappers, even though they are not particularly computationally intensive. Is this something I should strive to do with write_mcnp, write_alara, read_mcnp, et cetera?

scopatz commented 11 years ago

@elliottbiondo, the reason these functions are written in C++ is so that they may easily be called from other C++ code, not because they are particularly expensive or complex. In fact the hdf5 functions would have been _much_easier to write in Python/Cython ;).

If I were you, I would write them in Python/Cython first. Then only if you or someone else actually needs them on the C++ level will we promote/demote them to the C++ level. In general, it is much easy to develop and test in Python-mode. So at the very least, if you start here the effort isn't wasted when translating to C++ because all of the hard work of algorithm development is already done. That said, if someone (cough @gonuke) sees a use case for them being in C++ to begin with you are slightly better off starting there.

tl;dr start in Cython/Python and move to C++ later if needed.

scopatz commented 11 years ago

I am declaring victory here and moving on.