aws-samples / aws-parallelcluster-post-install-scripts

Scripts to customize AWS ParallelCluster
MIT No Attribution

REST-API fails with "slurmdbd.conf not found" #25

Open Gurgel100 opened 4 months ago

Gurgel100 commented 4 months ago

In a default ParallelCluster (3.9.1) configuration, slurmdbd.conf may not exist, which causes the REST API post-install script to fail:

Recipe: @recipe_files::/tmp/slurm_rest_api/slurm_rest_api.rb
  * ruby_block[Create JWT key file] action run
    - execute the ruby block Create JWT key file
  * file[/var/spool/slurm.state/jwt_hs256.key] action create
    - change mode from '0644' to '0600'
    - change owner from 'root' to 'slurm'
    - change group from 'root' to 'slurm'
  * directory[/var/spool/slurm.state] action create
    - change mode from '0700' to '0755'
  * ruby_block[Add JWT configuration to slurm.conf] action run
    - execute the ruby block Add JWT configuration to slurm.conf
  * ruby_block[Add JWT configuration to slurmdbd.conf] action run

    ================================================================================
    Error executing action `run` on resource 'ruby_block[Add JWT configuration to slurmdbd.conf]'
    ================================================================================

    ArgumentError
    -------------
    File '/opt/slurm/etc/slurmdbd.conf' does not exist

    Resource Declaration:
    ---------------------
    # In /tmp/slurm_rest_api/slurm_rest_api.rb

     40: ruby_block 'Add JWT configuration to slurmdbd.conf' do
     41:   block do
     42:     file = Chef::Util::FileEdit.new("#{slurm_etc}/slurmdbd.conf")
     43:     file.insert_line_after_match(/AuthType=*/, "AuthAltParameters=jwt_key=#{key_location}")
     44:     file.insert_line_after_match(/AuthType=*/, "AuthAltTypes=auth/jwt")
     45:     file.write_file
     46:   end
     47:   not_if "grep -q auth/jwt #{slurm_etc}/slurmdbd.conf"
     48: end
     49: 

    Compiled Resource:
    ------------------
    # Declared in /tmp/slurm_rest_api/slurm_rest_api.rb:40:in `from_file'

    ruby_block("Add JWT configuration to slurmdbd.conf") do
      action [:run]
      default_guard_interpreter :default
      declared_type :ruby_block
      cookbook_name "@recipe_files"
      recipe_name "/tmp/slurm_rest_api/slurm_rest_api.rb"
      block #<Proc:0x00007fdb6d578e60 /tmp/slurm_rest_api/slurm_rest_api.rb:41>
      not_if "grep -q auth/jwt /opt/slurm/etc/slurmdbd.conf"
    end

    System Info:
    ------------
    chef_version=18.2.7
    platform=ubuntu
    platform_version=22.04
    ruby=ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
    program_name=/usr/bin/cinc-client
    executable=/opt/cinc/bin/cinc-client

rkilpadi commented 2 months ago

Slurm accounting is a prerequisite for the Slurm REST API (this should be better documented). When Slurm accounting is enabled, slurmdbd.conf is generated. I added error handling to the post-install script to enforce this prerequisite.
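
For illustration, a guard of this kind could look roughly like the sketch below. This is not the code that was added to the repository. It is plain Ruby with no Chef dependency, and the names `MissingSlurmdbdConfError`, `ensure_slurmdbd_conf!`, and `add_jwt_config` are hypothetical. The idea is to fail early with a message that points at Slurm accounting, rather than letting `Chef::Util::FileEdit` raise a bare `ArgumentError`.

```ruby
# Hypothetical sketch, not the repository's actual fix.
# Plain Ruby, so it runs without Chef/cinc-client.

class MissingSlurmdbdConfError < StandardError; end

# Returns the path to slurmdbd.conf, or raises a descriptive error when the
# file is absent (the situation reported above, where Slurm accounting is
# not enabled).
def ensure_slurmdbd_conf!(slurm_etc)
  path = File.join(slurm_etc, 'slurmdbd.conf')
  return path if File.exist?(path)

  raise MissingSlurmdbdConfError,
        "#{path} not found: enable Slurm accounting before installing the Slurm REST API"
end

# Inserts the two JWT settings right after the first AuthType= line.
# Returns false without touching the file if auth/jwt is already configured,
# mirroring the not_if guard in the recipe.
def add_jwt_config(path, key_location)
  lines = File.readlines(path, chomp: true)
  return false if lines.any? { |line| line.include?('auth/jwt') }

  idx = lines.index { |line| line.start_with?('AuthType=') }
  raise ArgumentError, "no AuthType= line in #{path}" if idx.nil?

  lines.insert(idx + 1,
               'AuthAltTypes=auth/jwt',
               "AuthAltParameters=jwt_key=#{key_location}")
  File.write(path, lines.join("\n") + "\n")
  true
end
```

With the paths from the log above, a call would be `add_jwt_config(ensure_slurmdbd_conf!('/opt/slurm/etc'), '/var/spool/slurm.state/jwt_hs256.key')`. On a cluster without Slurm accounting, this stops with the explicit message instead of the `ArgumentError` shown in the report.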