rcarluccio commented 6 years ago

Hi all,

I have been exploring different inner solver options for my relatively high-resolution subduction models. MUMPS seems to be the fastest inner solver, but it needs the most memory, whereas multigrid is somewhat slower but needs less. However, even mg requires a significant amount of memory, which reduces the running time available on raijin, for example. Because the model runs slowly, it doesn't produce enough outputs in the time available. I am exploring different solver options as well as different supercomputers, and I have tested different numbers of CPUs and memory allocations.

Could you please confirm whether the argument has to be written as a string when specifying solver options? For example: solver.options.mg_pc_mg_type= "multiplicative". Would this be correct and effective?

I am also wondering whether there are any debugging tools available that would let users work more closely with developers. Is there, for example, a Python-friendly valgrind-style debugging tool that we could use to track down potential problems?

Thank you very much, Roberta

rbeucher commented 6 years ago

Hi Roberta,

Can you tell us a bit more about resolution, number of particles per element, etc.? What kind of runtime are you expecting? As for MUMPS vs MG: I tend to use MUMPS for 2D problems and stick with MG in 3D. I have a few 3D rift models running on Magnus at the moment and they are running just fine (so far).

rcarluccio commented 6 years ago

Hi Romain, thank you for your feedback. I'm running a 4000x4000x1000 km model on 256x128x128 elements. I'd be glad if I could ensure a resolution of at least one element every 10 km. I believe my model is robust. I would expect ~12 h of running time for 460 timestep outputs on 256 CPUs, requesting 256 GB of memory. On raijin I obtain only 40 timesteps in 24 hours with the same resources. If I increase the number of CPUs for the same amount of memory, the model is terminated because it exceeds the memory available per node (not the total, which I also find interesting). This is with multigrid; you can imagine how much more difficult this would become with MUMPS.
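
As a rough sanity check on the figures above, the element size and throughput can be worked out directly. This is only a sketch: it assumes the element counts map onto the domain extents in the order given (4000 km over 256, 4000 km over 128, 1000 km over 128), which the thread does not confirm, and that the requested memory is shared evenly between CPUs.

```python
import math

# Figures quoted in the comment above; the axis pairing is an assumption.
extent_km = (4000.0, 4000.0, 1000.0)
elements = (256, 128, 128)
target_km = 10.0  # desired element size

# Current element size along each axis.
size_km = [ext / n for ext, n in zip(extent_km, elements)]

# Elements needed per axis to reach the target spacing.
needed = [math.ceil(ext / target_km) for ext in extent_km]

# Memory per CPU if 256 GB is split evenly over 256 CPUs.
mem_per_cpu_gb = 256.0 / 256

# Observed rate on raijin (40 steps in 24 h) versus the hoped-for rate.
hours_per_step_observed = 24.0 / 40
hours_per_step_expected = 12.0 / 460
hours_for_460_steps = 460 * hours_per_step_observed
slowdown = hours_per_step_observed / hours_per_step_expected

print("element size (km):", size_km)
print("elements for 10 km spacing:", needed)
print("memory per CPU (GB):", mem_per_cpu_gb)
print("hours for 460 steps at observed rate:", hours_for_460_steps)
print("slowdown vs expectation:", slowdown)
```

Under these assumptions only the third axis is already finer than 10 km, and reaching 10 km everywhere would need about 400x400x100 elements, roughly 3.8 times the current element count.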

I'm now going to run this model on Magnus to check whether it's a shared-memory problem specific to the server or something else. I've also already reduced the maximum and minimum viscosity cut-offs, so the range is now 2 orders of magnitude.

Would you have any suggestions on other tools available to help with the debugging?

Thank you everyone in advance for your help!

Roberta


julesghub commented 6 years ago

Hi Roberta,

To inspect the Stokes solver's behaviour, I can suggest the following. First, enable PETSc's internal logging and memory statistics with the command-line options

    -log_view -memory_view

Then, in your model file, force the Stokes solver (I assume it's called 'solver') to do the same with

    solver.options.main.help=' -log_view ascii:stokes.log -memory_view'

Here I send the logging information to the file stokes.log.

Due to the design of uw's solver api the petsc options aren't straightforward to enable, hence the 2 stages.

Try this on something serial like docs/test/solver_1.py to get an understanding of it before HPC runs.
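
The second stage can be wrapped in a small helper so that each run writes to its own log file. This is only a sketch: `petsc_passthrough` is a hypothetical helper name, and the commented-out assignment assumes a Stokes solver object called `solver` that exposes `options.main.help` as described above.

```python
def petsc_passthrough(log_file="stokes.log", memory_view=True):
    """Build the option string that is passed through the 'help' option.

    The leading space matches the string quoted above.
    """
    parts = ["-log_view ascii:{}".format(log_file)]
    if memory_view:
        parts.append("-memory_view")
    return " " + " ".join(parts)


# One log file per configuration makes runs easy to compare afterwards.
opts = petsc_passthrough(log_file="stokes_mg_256cpu.log")
print(opts)

# In a model script (assumes an existing solver object named 'solver'):
# solver.options.main.help = opts
```

Keeping the string in one place also makes it easy to switch the logging off again for production runs.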

rcarluccio commented 6 years ago

Hi Julian,

thank you that sounds useful. I'll try that out and I'll let you know how I go with it.

What do you mean when you say the petsc options are not straightforward to enable? Do you reckon that some won't take effect even when they are passed to the routine? In any case, I can check this with some tests in serial.

julesghub commented 6 years ago

What I mean is: the current implementation of uw2's solver API destroys the global PETSc options dictionary every time we call solve(). This breaks the utility of PETSc's command-line arguments, and the only workaround is the one above, which is awkward to use. In future we need to address this because PETSc's command-line arguments are powerful, e.g. for inspecting solver behaviour.

rcarluccio commented 6 years ago

Thank you for explaining it. I can monitor the solver and the memory usage and plot some statistics. Perhaps we can see where to go from there.

rbeucher commented 6 years ago

I agree @julesghub, it would be great to have that possibility. However, most of the problems we encounter are due to errors in the set-up. @rcarluccio's model is rather simple. May I suggest giving UWGeo a try? Setting up a model should be quick, and the implementation has been tested, so we should see whether or not there is a problem.

rcarluccio commented 6 years ago

Hi Julian,

How can I change the SNES or Mat solver options? I get: AttributeError: 'OptionsGroup' object has no attribute 'snes'.

thank you

lmoresi commented 6 years ago

We don’t have a SNES implementation.

A big limitation, but one that we are not able to rectify in the short term.

L


rcarluccio commented 6 years ago

Ok. Is there a way to set an initial multiplier? or to create an initial viscosity term to try and "level out" the viscosity gradient?



lmoresi commented 6 years ago

What do you mean by an initial multiplier ?

The viscosity term: you can do this by loading viscosity fields as you see fit. The solution could then be used as an initial guess for the solver with an updated viscosity term. I think Taras Gerya does this kind of thing; maybe he even talks about it in his book. The good thing about the Python interface is that this is all under the control of the user (sorry John!!)
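
One way to read this suggestion is as a continuation over the viscosity contrast: solve first with the viscosity clipped to a narrow range, then widen the range in stages and reuse each solution as the next initial guess. The sketch below only builds the schedule of clipping bounds. `viscosity_bounds`, the reference viscosity and the number of stages are illustrative assumptions, and the solve step is left as a comment because it depends on the model script.

```python
def viscosity_bounds(eta_min, eta_max, eta_ref, stages):
    """Return (low, high) clipping bounds that widen from eta_ref
    to the full [eta_min, eta_max] range in equal steps of log10."""
    if stages < 1 or not (eta_min <= eta_ref <= eta_max):
        raise ValueError("need stages >= 1 and eta_min <= eta_ref <= eta_max")
    bounds = []
    for k in range(1, stages + 1):
        frac = k / stages
        low = eta_ref * (eta_min / eta_ref) ** frac
        high = eta_ref * (eta_max / eta_ref) ** frac
        bounds.append((low, high))
    return bounds


# Hypothetical non-dimensional values: a contrast of 1e4 around eta_ref = 1.
schedule = viscosity_bounds(1e-2, 1e2, 1.0, stages=4)
for low, high in schedule:
    print("clip viscosity to [{:.3g}, {:.3g}]".format(low, high))
    # Here: rebuild the viscosity function with these limits, solve,
    # and keep the velocity and pressure fields as the next initial guess.
```

The final stage uses the full range, so the last solve corresponds to the original problem.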


julesghub commented 6 years ago

@rcarluccio, LM is right that we don't have a SNES implementation to control the non linear behaviour.

We can control the linearised stokes flow solves with KSP solver options (not Mat). What kind of KSP options are you looking for? Use -help in the options string I mentioned above https://github.com/underworldcode/underworld2/issues/302#issuecomment-397257803 to see the available options.

rcarluccio commented 6 years ago

I will incorporate such a term in my script and see if the performance improves.

I had a little play around with the KSP options and, being a curious user, I was wondering about further possibilities, such as SNES.

I did use the -help option, and when I run with multigrid the log file lists, among the various options: -matptap_via Algorithmic approach (choose one of) scalable nonscalable hypre (MatPtAP). The PETSc solver uses MatPtAP, which does a local RAP to reduce communication and accelerate computation, so I was wondering whether and how this could be modified. However, it seems that the default is 'nonscalable' for small to medium-sized matrices, and that it automatically switches to 'scalable' when the matrix gets larger. So I am not going to worry about it, but I am looking forward to hearing more about scalability next week.

rcarluccio commented 6 years ago

Hi all,

I have been seeing a user warning in the log files of some of my models. It occurs at relatively medium to high resolution and for low to medium viscosity contrasts (Newtonian rheology models). It may be nothing, but what would be the best way to check for this?

    Linear solver (MGFZH7HU__system-execute), solution time 2.212890e+01 (secs)
    /group/m18/underworld/underworld2_test/underworld/systems/_bsscr.py:471: UserWarning: A floating-point operation error has been detected during the solve.
    The resultant solution fields are most likely erroneous, check them thoroughly.
    This is likely due to large number variations in the linear algrebra or fragile solver configurations.
    Consider rescaling the fn_viscosity or fn_bodyforce inputs to avoid this problem.
    This warning can be supressed with the argument 'fpwarning=False'.
      warnings.warn(estring)
    Linear solver (MGFZH7HU__system-execute)

This is the function found in the BSSCR source code:

    # check if fp error was detected and 'reduce' result to proc 0
    lres, gres = np.zeros(1), np.zeros(1)

    lres[:] = uw.libUnderworld.Underworld.Underworld_fetestexcept()
    comm = MPI.COMM_WORLD
    comm.Allreduce(lres, gres, op=MPI.SUM)

    if gres[0] > 0 and fpwarning:
        import warnings
        estring = "A floating-point operation error has been detected during the solve.\n" + \
        "The resultant solution fields are most likely erroneous, check them thoroughly.\n"+ \
        "This is likely due to large number variations in the linear algrebra or fragile solver configurations.\n"+ \
        "Consider rescaling the fn_viscosity or fn_bodyforce inputs to avoid this problem.\n"+ \
        "This warning can be supressed with the argument 'fpwarning=False'."
        if uw.rank() == 0:
            warnings.warn(estring)
    return
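
A direct check, independent of the warning, is to scan the solution values for NaN or infinite entries and to look at their range. The helper below is a generic sketch using only the standard library. `summarise_values` is a hypothetical name, and it is assumed that the field data can be flattened into a plain one-dimensional sequence of floats (for instance from a NumPy array).

```python
import math


def summarise_values(values):
    """Count non-finite entries and report the range of the finite ones."""
    finite = [v for v in values if math.isfinite(v)]
    n_bad = len(values) - len(finite)
    magnitudes = [abs(v) for v in finite if v != 0.0]
    return {
        "non_finite": n_bad,
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
        # Orders of magnitude spanned by the non-zero finite values.
        "decades": (math.log10(max(magnitudes) / min(magnitudes))
                    if magnitudes else 0.0),
    }


# Made-up sample standing in for a flattened velocity or pressure field.
sample = [1.0e-3, -2.5, 4.0e2, float("nan"), float("inf")]
report = summarise_values(sample)
print(report)
```

Any non-zero count of non-finite entries, or a span of many decades, is a hint that the inputs deserve a closer look before the results are trusted.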

rcarluccio commented 6 years ago

Correction: the user warning occurs independently of the viscosity contrast used in the model. It happens for resolutions greater than 128x32x32 (e.g. 256x64x64).

rbeucher commented 6 years ago

I think the message you get is pretty self-explanatory. You need to check your scaling. It is very likely that some of your solution is complete garbage... :-/ Go through the range of values you are using and the way you do your scaling.

julesghub commented 6 years ago

Romain's correct, I would change the scaling of the model and run it again. Typically this message occurs when the magnitude of the RHS (the force vector) doesn't balance well with the magnitude of the coefficient matrix (viscosity).
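
To illustrate the imbalance, the sketch below compares typical SI magnitudes for the viscosity and the body force with their values after rescaling by characteristic scales. All numbers are illustrative assumptions rather than values from the model in this thread: a reference viscosity of 1e21 Pa s, a density contrast of 80 kg/m^3 and a length scale of 1000 km.

```python
# Illustrative SI values (assumptions, not taken from the model above).
eta_ref = 1.0e21      # reference viscosity, Pa s
delta_rho = 80.0      # density contrast, kg / m^3
gravity = 9.81        # m / s^2
length = 1.0e6        # length scale, m (1000 km)

# Derived characteristic scales.
stress_scale = delta_rho * gravity * length   # Pa
time_scale = eta_ref / stress_scale           # s

# In SI units the two inputs differ by many orders of magnitude.
body_force_si = delta_rho * gravity           # N / m^3
ratio_si = eta_ref / body_force_si

# After rescaling, both are of order one.
eta_scaled = eta_ref / (stress_scale * time_scale)
body_force_scaled = body_force_si * length / stress_scale

print("SI viscosity / body force ratio: {:.2e}".format(ratio_si))
print("scaled viscosity:", eta_scaled)
print("scaled body force:", body_force_scaled)
```

With these numbers the SI ratio is of order 1e18, while both rescaled inputs are close to 1, which is the kind of balance between the force vector and the coefficient matrix described above.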

jmansour commented 5 years ago

Closing due to inactivity. Reopen if necessary.

underworldcode / underworld2

How to reduce memory usage and running time in 3D models. #302
