FluidNumerics / FEOTS

BSD 3-Clause "New" or "Revised" License

Segmentation fault in OperatorDiagnosis #2

Closed: fluidnumerics-joe closed this issue 6 years ago

fluidnumerics-joe commented 6 years ago

When diagnosing the transport operators from the impulse response field and graph file for the new 27-point stencil, a segmentation fault occurs in the ReadGraphBinFile routine of the POP_AdjacencyGraph_Class module.

Fortran runtime error: Index '1650000001' of dimension 1 of array 'localioarray' above upper bound of 1650000000

Error termination. Backtrace:
#0  0x421ccd in __pop_adjacencygraph_class_MOD_readgraphbinfile_pop_adjacencygraph
    at /turquoise/users/schoonover/FEOTS/src/POP/POP_AdjacencyGraph_Class.f90:418
#1  0x436add in operatordiagnosis
    at /turquoise/users/schoonover/FEOTS/src/POP/programs/OperatorDiagnosis.f90:116
#2  0x43d5de in main
    at /turquoise/users/schoonover/FEOTS/src/POP/programs/OperatorDiagnosis.f90:40

This error has been produced with gcc/6.4.0 and occurs with both debug and production runs.

fluidnumerics-joe commented 6 years ago

GreedyColoring writes the graph binary file, but the number of chunks read back from that file does not match the number of chunks written to it.

In the read and write routines for this binary file, the chunk size is determined by the byte size of a local integer indexer, i. In the write routine, i and localIOArray are 8-byte integers; in the read routine, they are currently 4-byte integers. This mismatch gives the two routines different array lengths and chunk sizes, which is what leads to the seg fault.

Fortunately, this does not cause any issues with the generated impulse field.
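To make the byte-size mismatch concrete, here is a minimal Python sketch. It is not the FEOTS file layout: the `chunk_layout` helper, the 64-byte chunk budget, and the 100-value payload are all assumptions made up for illustration. It only shows how deriving the chunk size from the integer width gives a writer and a reader different chunk counts, and how the same bytes decode into a different number of elements.

```python
import struct


def chunk_layout(n_values, int_bytes, chunk_bytes=64):
    """Hypothetical layout: a chunk holds chunk_bytes // int_bytes integers."""
    per_chunk = chunk_bytes // int_bytes
    n_chunks = -(-n_values // per_chunk)  # ceiling division
    return per_chunk, n_chunks


values = list(range(1, 101))

# Writer side: every value is stored as an 8-byte little-endian integer.
payload = struct.pack(f"<{len(values)}q", *values)

# The writer plans its chunks with 8-byte integers, the reader with 4-byte ones.
write_plan = chunk_layout(len(values), int_bytes=8)
read_plan = chunk_layout(len(values), int_bytes=4)

# Decoding the same bytes with the wrong width changes the element count.
decoded_as_int64 = struct.unpack(f"<{len(payload) // 8}q", payload)
decoded_as_int32 = struct.unpack(f"<{len(payload) // 4}i", payload)

print("writer (per chunk, chunks):", write_plan)
print("reader (per chunk, chunks):", read_plan)
print("elements as int64:", len(decoded_as_int64))
print("elements as int32:", len(decoded_as_int32))
```

With the reader and writer agreeing on the integer width, both plans and both element counts coincide, which is the property the patch is meant to restore.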

I'm currently testing a patch that remedies this. I will compare the impulse fields and graph files after the patch to verify that they are identical. The patch will primarily affect the ReadGraphBinFile routine.
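One way to do that comparison is a byte-for-byte check of the files produced before and after the patch. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes the files can be read from the machine running the script.

```python
import filecmp
import hashlib


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large binaries are never fully loaded in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """Compare file contents byte for byte (shallow=False skips the stat shortcut)."""
    return filecmp.cmp(path_a, path_b, shallow=False)
```

`files_identical` is enough when both files sit on the same filesystem. `sha256_of` is useful when the pre-patch and post-patch files live on different machines and only the digests can be exchanged.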

@JiaxuZ

fluidnumerics-joe commented 6 years ago

@JiaxuZ, I've pushed fixes that should resolve this issue, but I cannot test them on Badger due to memory limitations. Could you pull down the master branch and test it out?

Recompile and run GreedyColoring and OperatorDiagnosis when you get a chance. Re-running GreedyColoring will provide consistent graph files for the coloring algorithm, but will not modify the impulse functions used.

You'll likely need a node with ~160 GB of RAM.

JiaxuZ commented 6 years ago

The new code worked out perfectly. Thanks, @schoonovernumerics. Unfortunately, one thing I didn't notice is that when I ran the online POP simulation, I didn't save the variable VDC_S over the course of the run. I'll need to re-run part of the POP simulation to regenerate it.

fluidnumerics-joe commented 6 years ago

This is great to hear! Can we go ahead and close this issue?