openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org
Other
745 stars 481 forks source link

Add methods to load and clear Python API tally data #533

Open samuelshaner opened 8 years ago

samuelshaner commented 8 years ago

In the Python API, data is only loaded when the sum or sum_sq is queried. If someone wants to manipulate the tally mean or std_dev directly within the tally object (e.g. during tally arithmetic), the sum and sum_sq must be first queried to load the tally data, which can be awkward.

It would be nice if load_tally_data(...) and clear_tally_data(...) methods were added to the tally class and the tally data was loaded within the statepoint.get_tally(...) method.

Credit to @wbinventor for raising this issue in PR #525.

wbinventor commented 8 years ago

Thanks for opening this issue @samuelshaner. Before one of us goes about this, we should discuss whether we want to retain the on-the-fly tally data loading functionality that is currently in the code. It appeared from your second paragraph that you suggest loading data when a tally is retrieved from the statepoint. I would suggest a compromise solution wherein we add methods to load data using a parameter indicating which metric to load (i.e., 'sum', 'mean', etc.) but we call it on the fly in the property getters. On Dec 23, 2015 8:40 AM, "sam" notifications@github.com wrote:

In the Python API, data is only loaded when the sum or sum_sq is queried. If someone wants to manipulate the tally mean or std_dev directly within the tally object (e.g. during tally arithmetic), the sum and sum_sq must be first queried to load the tally data, which can be awkward.

It would be nice if load_tally_data(...) and clear_tally_data(...) methods were added to the tally class and the tally data was loaded within the statepoint.get_tally(...) method.

Credit to @wbinventor https://github.com/wbinventor for raising this issue in PR #525 https://github.com/mit-crpg/openmc/pull/525.

— Reply to this email directly or view it on GitHub https://github.com/mit-crpg/openmc/issues/533.

samuelshaner commented 8 years ago

I'm fine with having the load and clear methods called in the property getters. Would you recommend then changing the __deepcopy__(...) method to call the attributes by their property getters instead of using direct access? In the hybrid_product(...) method each of the tallies are deep copied. If one tally is used in tally arithmetic multiple times (such as flux in the case below), the data would end up being loaded from the statepoint file multiple times if __deepcopy__(...) uses direct access. If __deepcopy__(...) uses the property getter, the data would only be loaded from statepoint once.

flux = sp.get_tally('flux')
abs = sp.get_tally('abs')
fis = sp.get_tally('fis')

abs_xs = abs / flux
fis_xs = fis / flux

I don't have a strong opinion, but would probably opt for __deepcopy__(...) to continue using direct access. I just wanted to bring this up as a potential performance concern if reading the data happens to be a bottleneck.

wbinventor commented 8 years ago

@samuelshaner - why would you probably opt to continue using direct access in __deepcopy__(...)? It seems to me that you made some good arguments as to why we should use the property getters instead as a runtime optimization to reduce the number of times we load HDF5 data from disk.

samuelshaner commented 8 years ago

My concern with using the property getters is that performing a deep copy would change the state of the tally (i.e. it would read in the data) and these base tallies would not get garbage collected until the ipython notebook is terminated or the tally goes out of scope in the python script. If the base tallies use a lot of memory tallies, but the derived tallies created by tally arithmetic or slicing are relatively small, carrying around all the memory from the base tallies might reduce performance.

I was just following the philosophy of "only read in the data when you need it". Since the deep copy doesn't need the data, I am leaning more towards using direct access in deep copy. That said, I don't have a strong preference either way and, as you can see, my thoughts are more speculation on what would be best for most use cases, so if we think using property getters would be the better solution, I'm on board with that.

wbinventor commented 8 years ago

Yeah, I think you make some good points here - now I am torn :-) It seems like we are debating whether to make a compute vs. memory optimization the default behavior. Without knowing more about which is actually more important in real scenarios, I would agree with you about using direct access to minimize the memory footprint.

samuelshaner commented 8 years ago

Okay, it sounds like we are in agreement that direct access is the way to go, at least for the time being. Do you want to make a PR for this issue or should I? I'm fine either way.

wbinventor commented 8 years ago

Do we need a separate issue for this, or can this issue suffice? I suspect that a single PR to address these items will not require very much code and can be combined into one since they are related anyway.

On Wed, Dec 30, 2015 at 3:37 PM, sam notifications@github.com wrote:

Okay, it sounds like we are in agreement that direct access is the way to go, at least for the time being. Do you want to make a PR for this issue or should I? I'm fine either way.

— Reply to this email directly or view it on GitHub https://github.com/mit-crpg/openmc/issues/533#issuecomment-168069868.

Will Boyd Nuclear Science & Engineering Massachusetts Institute of Technology Cell: 423.413.8469 wbinventor@gmail.com

samuelshaner commented 8 years ago

I think this issue can suffice. Based on our discussion, I don't think the deep copy method needs to be modified, so the PR should just have to add the two new methods and modify some other methods in tallies.py as needed.

wbinventor commented 8 years ago

Agreed

On Wed, Dec 30, 2015 at 3:46 PM, sam notifications@github.com wrote:

I think this issue can suffice. Based on our discussion, I don't think the deep copy method needs to be modified, so the PR should just have to add the two new methods and modify some other methods in tallies.py as needed.

— Reply to this email directly or view it on GitHub https://github.com/mit-crpg/openmc/issues/533#issuecomment-168070879.

Will Boyd Nuclear Science & Engineering Massachusetts Institute of Technology Cell: 423.413.8469 wbinventor@gmail.com