mesoscale / mmsp

The Mesoscale Microstructure Simulation Project

Issue with large grids? #15

Closed djdcrist closed 7 years ago

djdcrist commented 8 years ago

I've recently run into a problem when reading microstructural data from a file and using it to create an MMSP grid representation of the microstructure. The import code I wrote is fairly simple and works well. After the data is imported, I call a function I wrote that linearly expands the grid along any chosen dimension by an arbitrary integer factor x: it makes a new grid whose length in that dimension is x times that of the original grid and whose dx in that dimension is 1/x of the original. This function also produces the results I expect, with one exception: it doesn't work for any value larger than x=4.
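For readers following along, the expansion described above could look roughly like the sketch below. It is not the code from this issue and does not use the MMSP grid class: `Grid2D` and `expand` are hypothetical names, the grid is a plain row-major array, and each original value is simply replicated x times along the chosen axis (the original function may interpolate instead).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D grid; this is NOT the MMSP::grid class.
struct Grid2D {
    std::size_t nx, ny;    // number of points along x and y
    double dx, dy;         // point spacing along x and y
    std::vector<int> data; // row-major storage: data[y * nx + x]
};

// Expand `in` along one axis (0 = x, 1 = y) by an integer factor.
// The point count along that axis is multiplied by `factor` and the
// spacing along it is divided by `factor`. Each original value is
// replicated `factor` times (an assumption made for this sketch).
Grid2D expand(const Grid2D& in, int axis, std::size_t factor) {
    assert(axis == 0 || axis == 1);
    assert(factor >= 1);
    assert(in.data.size() == in.nx * in.ny);

    const std::size_t fx = (axis == 0) ? factor : 1;
    const std::size_t fy = (axis == 1) ? factor : 1;

    Grid2D out{in.nx * fx, in.ny * fy,
               in.dx / static_cast<double>(fx),
               in.dy / static_cast<double>(fy),
               std::vector<int>(in.nx * fx * in.ny * fy)};

    for (std::size_t y = 0; y < out.ny; ++y) {
        for (std::size_t x = 0; x < out.nx; ++x) {
            // Integer division maps each fine point back to its source point.
            out.data[y * out.nx + x] = in.data[(y / fy) * in.nx + (x / fx)];
        }
    }
    return out;
}
```

Calling `expand(g, 0, 3)` on a grid with two points along x yields six points along x with one third of the original dx. Sizes are held in `std::size_t` so that the point count does not overflow a 32-bit `int` as the factor grows.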

The initialization runs just fine, and the output file appears to contain the correct data, judging by the visualization I get after converting the MMSP binary file to a vti file. However, when I try to run the simulation, it quits with a Signal 6 termination code before completing even one step. The simulations are being run on the AMOS BG/Q system.

tkphd commented 8 years ago

When the code aborts (Signal 6 = SIGABRT), does it print any messages indicating why it has done so? Is there a failed assert() statement, perhaps?

If the code isn't telling you why it aborted, you'll have to trace the error. If you compile the program with debugging hooks and with little or no optimization (-O0 or -O1), each MPI rank will write a memory dump to disk upon termination. Use the addr2line utility to convert a memory address to a line number in the source code; see, e.g., https://computing.llnl.gov/tutorials/bgq/#addr2line.

tkphd commented 8 years ago

On AMOS particularly, overwriting an existing file with less data does not truncate the old data, which then trips up whatever utility or simulation tries to read the file back in. For that reason, the MPI_File_open operation specifies MPI::MODE_EXCL, so the command succeeds if and only if the file does not already exist. Otherwise, the program exits, possibly without giving a reason.

Bottom line: make sure any file your program is attempting to write does not already exist. Let me know if this is the cause.
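One way to act on this advice is to check for a stale output file before the run starts. The sketch below uses only the C++ standard library, not MPI-IO; `file_exists`, `write_if_absent`, and `remove_if_present` are hypothetical helper names, not MMSP functions. Unlike an exclusive-mode open, the check and the write here are two separate steps, so the sketch only suits a single process preparing its own output.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Returns true if `path` can be opened for reading, i.e. it already exists
// and is readable.
bool file_exists(const std::string& path) {
    assert(!path.empty());
    std::ifstream f(path.c_str());
    return f.good();
}

// Writes `contents` to `path` only if no file is there yet.
// Returns false, without touching the file, if it already exists.
// Note: check-then-write is not atomic, unlike an exclusive-mode open.
bool write_if_absent(const std::string& path, const std::string& contents) {
    if (file_exists(path)) {
        return false;
    }
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) {
        return false;
    }
    out << contents;
    return out.good();
}

// Deletes a stale output file so a later exclusive open can succeed.
// Returns true if a file was found and removed.
bool remove_if_present(const std::string& path) {
    return file_exists(path) && std::remove(path.c_str()) == 0;
}
```

In a job script or driver, calling `remove_if_present` on each expected output name (or renaming the old files) before launching the simulation avoids the silent exit described above.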

tkphd commented 8 years ago

If you have a chance, try applying PR #16. You can do so from your MMSP directory using the following commands (blatantly stolen from the PR merge instructions).

From your project repository, check out a new branch and test the changes.

git checkout -b tkphd-issue15-Issue_with_large_grids master
git pull https://github.com/tkphd/mmsp.git issue15-Issue_with_large_grids

The code should be updated -- try compiling locally, then on AMOS, and see if the code runs any cleaner. If not, it should at least give a better idea why it's failing. If it works, please comment on #16 to that effect.

tkphd commented 7 years ago

Closing due to inactivity. Issue cannot be reproduced outside the user's code.