Exawind / nalu-wind

Solver for wind farm simulations targeting exascale computational platforms
https://exawind.github.io/nalu-wind/

Revert "Converted to the new STK simple_fields workflow" #1234

Closed jrood-nrel closed 9 months ago

jrood-nrel commented 9 months ago

Reverts Exawind/nalu-wind#1233

PR #1233 was causing Nalu-Wind to segfault when built with the Intel compiler.

djglaze commented 9 months ago

Hi @jrood-nrel,

Could you share some more details about the seg-faults you were seeing? I'm unable to trigger any misbehavior with the intel-2021.1.2 compiler here at Sandia, with either the unit or regression tests. What compiler did you use, what machine was it on, and which test(s) were seg-faulting? Thanks!

jrood-nrel commented 9 months ago

This was on Eagle. It's a case Ganesh is running.

MPT: #1  0x00002b53f0858c96 in mpi_sgi_system (
MPT: #2  MPI_SGI_stacktraceback (
MPT:     header=header@entry=0x7ffefbf08990 "MPT ERROR: Rank 242(g:242) received signal SIGSEGV(11).\n\tProcess ID: 72481, Host: r3i2n11, Program: /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/"...) at sig.c:340
MPT: #3  0x00002b53f0858e8f in first_arriver_handler (signo=signo@entry=11, 
MPT:     stack_trace_sem=stack_trace_sem@entry=0x2b53fedc0080) at sig.c:489
MPT: #4  0x00002b53f0859123 in slave_sig_handler (signo=11, 
MPT:     siginfo=<optimized out>, extra=<optimized out>) at sig.c:565
MPT: #5  <signal handler called>
MPT: #6  0x00002b53e38c2314 in sierra::nalu::max_extent(stk::mesh::FieldBase const&, unsigned int) ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #7  0x00002b53e37d439e in sierra::nalu::NodalGradAlgDriver<stk::mesh::Field<double, void, void, void, void, void, void, void> >::post_work() ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #8  0x00002b53e2670ba7 in sierra::nalu::WallDistEquationSystem::solve_and_update() ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #9  0x00002b53e223c980 in sierra::nalu::EquationSystems::initial_work() ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #10 0x00002b53e25ef833 in sierra::nalu::TimeIntegrator::prepare_for_time_integration() ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #11 0x00002b53e25efce5 in sierra::nalu::TimeIntegrator::integrate_realm() ()
MPT:    from /lustre/filesystem/scratch/user/spack-manager/spack/opt/spack/linux-rhel7-skylake_avx512/intel-20.0.2/nalu-wind-master-33lj2rbocawi6fvq2lqcqugtdhpzwbrf/lib/libnalu.so
MPT: #12 0x0000000000415121 in main ()

Simulations:
- name: sim1
  optimizer: opt1
  time_integrator: ti_1
Time_Integrators:
- StandardTimeIntegrator:
    name: ti_1
    realms:
    - realm_1
    second_order_accuracy: true
    start_time: 0
    termination_step_count: 100
    time_step: 0.0002667
    time_step_count: 0
    time_stepping_type: fixed
realms:
- boundary_conditions:
  - target_name: wing
    wall_boundary_condition: bc_wing
    wall_user_data:
      turbulent_ke: 0.0
      use_wall_function: false
      velocity:
      - 0
      - 0
      - 0
  - target_name: wing-pp
    wall_boundary_condition: bc_wing_pp
    wall_user_data:
      turbulent_ke: 0.0
      use_wall_function: false
      velocity:
      - 0
      - 0
      - 0
  - inflow_boundary_condition: bc_inflow
    inflow_user_data:
      specific_dissipation_rate: 919.3455
      turbulent_ke: 0.0010422
      velocity:
      - 75.0
      - 0.0
      - 0.0
    target_name: inlet
  - open_boundary_condition: bc_open
    open_user_data:
      pressure: 0.0
      specific_dissipation_rate: 919.3455
      turbulent_ke: 0.0010422
      velocity:
      - 0.0
      - 0.0
      - 0.0
    target_name: outlet
  - periodic_boundary_condition: bc_front_back
    periodic_user_data:
      search_tolerance: 0.0001
    target_name:
    - front
    - back
  check_for_missing_bcs: true
  equation_systems:
    max_iterations: 4
    name: theEqSys
    solver_system_specification:
      velocity: solve_mom
      turbulent_ke: solve_scalar
      specific_dissipation_rate: solve_scalar
      pressure: solve_elliptic
      ndtw: solve_elliptic
    systems:
    - WallDistance:
        convergence_tolerance: 1.0e-08
        max_iterations: 1
        name: myNDTW
    - LowMachEOM:
        convergence_tolerance: 1.0e-08
        max_iterations: 1
        name: myLowMach
    - ShearStressTransport:
        convergence_tolerance: 1.0e-08
        max_iterations: 1
        name: mySST
  initial_conditions:
  - constant: ic_1
    target_name: fluid-hex
    value:
      pressure: 0
      specific_dissipation_rate: 919.3455
      turbulent_ke: 0.0010422
      velocity:
      - 75.0
      - 0.0
      - 0.0
  material_properties:
    specifications:
    - name: density
      type: constant
      value: 1.2
    - name: viscosity
      type: constant
      value: 9.0e-06
    target_name: fluid-hex
  mesh: mesh/ffa_w3_500_525_288_121_32.exo
  #automatic_decomposition_type: rcb
  #rebalance_mesh: yes
  #stk_rebalance_method: parmetis
  #use_edges: yes
  #check_jacobians: true
  name: realm_1
  output:
    output_data_base_name: results/ffa_w3_500_32_sst.e
    output_frequency: 100
    output_node_set: false
    output_variables:
    - velocity
    - density
    - pressure
    - pressure_force
    - viscous_force
    - tau_wall_vector
    - tau_wall
    - turbulent_ke
    - specific_dissipation_rate
    - minimum_distance_to_wall
    - sst_f_one_blending
    - turbulent_viscosity
    - element_courant
    - q_criterion
    - vorticity
    - assembled_area_force_moment
  restart:
    restart_data_base_name: restart/sst_ffa_w3_500_32.rst
    restart_frequency: 500
  solution_options:
    name: myOptions
    options:
    - hybrid_factor:
        specific_dissipation_rate: 1.0
        turbulent_ke: 1.0
        velocity: 1.0
    - alpha_upw:
        specific_dissipation_rate: 1.0
        turbulent_ke: 1.0
        velocity: 1.0
    - upw_factor:
        specific_dissipation_rate: 0.0
        turbulent_ke: 0.0
        velocity: 1.0
    - limiter:
        pressure: true
        velocity: true
    - noc_correction:
        pressure: true
    - projected_nodal_gradient:
        ndtw: element
        pressure: element
        specific_dissipation_rate: element
        turbulent_ke: element
        velocity: element
    - relaxation_factor:
        pressure: 0.3
        specific_dissipation_rate: 0.7
        turbulent_ke: 0.7
        velocity: 0.7
    - turbulence_model_constants:
        SDRWallFactor: 0.625
    projected_timescale_type: momentum_diag_inv
    turbulence_model: sst
  use_edges: true
linear_solvers:
- dump_hypre_matrix_stats: false
  hypre_cfg_file: hypre_file.yaml
  hypre_cfg_node: hypre_simple_precon
  kspace: 100
  max_iterations: 100
  method: hypre_gmres
  name: solve_mom
  output_level: 0
  preconditioner: boomerAMG
  recompute_preconditioner_frequency: 1
  reuse_linear_system: true
  segregated_solver: true
  simple_hypre_matrix_assemble: true
  tolerance: 1e-5
  type: hypre
  write_matrix_files: false
- dump_hypre_matrix_stats: false
  hypre_cfg_file: hypre_file.yaml
  hypre_cfg_node: hypre_simple_precon
  kspace: 100
  max_iterations: 100
  method: hypre_gmres
  name: solve_scalar
  preconditioner: boomerAMG
  recompute_preconditioner_frequency: 1
  reuse_linear_system: true
  simple_hypre_matrix_assemble: true
  tolerance: 1e-5
  type: hypre
  write_matrix_files: false
- dump_hypre_matrix_stats: false
  hypre_cfg_file: hypre_file.yaml
  hypre_cfg_node: hypre_elliptic
  kspace: 40
  max_iterations: 100
  method: hypre_gmres
  name: solve_elliptic
  preconditioner: boomerAMG
  recompute_preconditioner_frequency: 1
  reuse_linear_system: true
  simple_hypre_matrix_assemble: true
  tolerance: 1e-5
  type: hypre
  write_matrix_files: false
djglaze commented 9 months ago

@jrood-nrel Thanks for the information! The stack trace was juuuust enough to figure it out. I messed up a Field name when configuring a ScalarNodalGradAlgDriver instance inside the WallDistEquationSystem, so it ended up referencing a null Field when retrieving sizing information to update either a periodic BC Field or an overset mesh Field after the nodal grad calculation was complete. This mistake should be independent of compiler, and we have 17 regression tests that exercise this equation system. I find it a bit troubling that my local GCC build ran just fine with precisely zero diffs.
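
For illustration, the failure mode described above (a mistyped field name producing a null Field that is later dereferenced for sizing information) can be sketched in isolation. This is a minimal sketch and not the nalu-wind or STK code: `Field`, `FieldRegistry`, `get_checked`, and the field name `wall_distance_grad` are all hypothetical stand-ins, and `max_extent` here only borrows the name seen in the stack trace. It assumes a by-name lookup that returns a null pointer for an unknown name.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a mesh field; NOT the STK FieldBase type.
struct Field {
  std::string name;
  unsigned extent; // number of components stored per node
};

// Hypothetical stand-in for the metadata object that owns registered fields.
class FieldRegistry {
public:
  void declare(const std::string& name, unsigned extent) {
    fields_[name] = Field{name, extent};
  }

  // Unchecked lookup: returns nullptr when the name was never registered.
  const Field* get(const std::string& name) const {
    auto it = fields_.find(name);
    return it == fields_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, Field> fields_;
};

// Checked lookup: a mistyped name fails here, at configuration time,
// instead of surfacing later as a null dereference.
const Field& get_checked(const FieldRegistry& reg, const std::string& name) {
  const Field* f = reg.get(name);
  if (f == nullptr) {
    throw std::runtime_error("Field not found: " + name);
  }
  return *f;
}

// Simplified sizing query; taking a reference means the caller must
// already hold a valid Field.
unsigned max_extent(const Field& f) { return f.extent; }
```

With the unchecked `get`, writing `max_extent(*reg.get("wal_distance_grad"))` dereferences a null pointer, which is undefined behavior and may or may not crash depending on the build. With `get_checked`, the same typo throws an exception that names the missing field.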

Either way, I'll wait to reintroduce this simple_fields update until after @psakievich is finished with updating nalu-wind with his smart fields and field manager changes. This patch should be significantly smaller once that work is done.