fgci-org / ansible-role-slurm

For installing and configuring SLURM - Simple Linux Utility for Resource Management
MIT License

Hello Ansible Galaxy #114

Closed martbhell closed 4 years ago

martbhell commented 4 years ago

Galaxy doesn't like the way we say ansible-role-pam is a dependency:

Starting import: task_id=525198, repository=CSCfi/ansible-role-slurm 

===== LOADING ROLE ===== 
Task "525198" failed: Expecting dependency name format to match "username.role_name", got ansible-role-pam 

This PR tries out a new syntax where we specify the role-pam used in fgci-ansible: https://github.com/CSCfi/fgci-ansible/blob/master/requirements.yml#L178

It does not specify a version, but it is possible to also do that: https://docs.ansible.com/ansible/latest/galaxy/user_guide.html#dependencies
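For illustration, a dependency entry of this kind in meta/main.yml might look like the sketch below. The `src` URL is a placeholder rather than the real location of role-pam, and `version: master` is an assumed value. The keys (`src`, `name`, `version`) are the ones described in the Galaxy user guide linked above. Whether the Galaxy importer accepts this form is not confirmed here.

```yaml
# meta/main.yml (sketch; the src URL and version are placeholders, not the real values)
dependencies:
  - src: https://github.com/example-org/ansible-role-pam  # hypothetical URL
    name: ansible-role-pam   # local name the role is installed under
    version: master          # optional; assumed branch name
```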

Is this dependency listing in meta/main.yml even needed anymore? We haven't listed the NHC role in there but it's in the requirements.yml.

VilleS1 commented 4 years ago

In the doc https://galaxy.ansible.com/docs/contributing/creating_role.html#role-metadata it says: "When Galaxy imports a role, the import process looks for metadata found in the role’s meta/main.yml file."

Does this role depend on role-pam to function? If it does, then for Ansible Galaxy role-pam should be listed under dependencies in meta/main.yml. It looks like meta/main.yml is only for Galaxy, but I'm not sure whether it affects running playbooks locally. Maybe the behaviour depends on the -r option in: /usr/bin/ansible-galaxy install -r requirements.yml --force

martbhell commented 4 years ago

> In the doc https://galaxy.ansible.com/docs/contributing/creating_role.html#role-metadata it says: "When Galaxy imports a role, the import process looks for metadata found in the role’s meta/main.yml file."

I think Galaxy here refers to the Galaxy server, not the ansible-galaxy CLI tool.

> Does this role depend on role-pam to function? If it does, then for Ansible Galaxy role-pam should be listed under dependencies in meta/main.yml. It looks like meta/main.yml is only for Galaxy, but I'm not sure whether it affects running playbooks locally. Maybe the behaviour depends on the -r option in: /usr/bin/ansible-galaxy install -r requirements.yml --force

You can probably test the effect of the meta/main.yml much easier than I can.

One place to look is Travis, where we run test.yml, which has:

 - hosts: install,compute
   remote_user: root
   roles:
     - ansible-role-pam
     - ansible-role-nhc
     - ansible-role-slurm

If we look at the Travis output of the Ansible run, then yes: ansible-role-pam is applied first, then nhc, and then pam again right before slurm. I assume this is because of meta/main.yml, but I have not verified this.

IIRC that is what we wanted: to make sure the pam role is applied before slurm. I'm not sure right now what the reason for this was.

I'm quite sure Slurm also depends on NHC: if nhc is not available, slurm won't start. But that's not in the meta file. So maybe we are also happy to remove the pam one?

Now in fgci-ansible we just control that by listing the pam and nhc roles before slurm. Maybe that's enough?

I hope the new syntax used here is good enough, but I can't test it easily (I don't have a slurm cluster set up :). For example, what happens during ansible-pull? If we want to keep it in here, maybe it would be best to also set a version: ..

I tried commenting out ansible-role-pam in requirements.yml and then running ansible-galaxy install -r requirements.yml, and it did not install the pam role listed in a role's meta file.

Found e8493a25c7f79c0eace1b3621e947e33f07d009b and b996a6279887b5de50821e5e4dfbd98833cb25a3 on this topic too..

VilleS1 commented 4 years ago

Ok, it might be that the doc refers to the CLI tool. But it looks like the meta stuff is for the galaxy.ansible.com server that hosts these roles. Isn't this why we are having this discussion in the first place? If one doesn't care about that, then scrap the meta and use requirements.yml only.

And the reapplying of roles is caused by this meta stuff, if I read the docs correctly. It defines the dependency tree but also affects how roles are run. Maybe.

martbhell commented 4 years ago

OK. I set dependencies to [] so that should make the Galaxy server really happy.
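In meta/main.yml that would presumably amount to something like the sketch below. The rest of the file (for example galaxy_info) is left out and assumed unchanged.

```yaml
# meta/main.yml (sketch; other keys omitted)
dependencies: []   # no role dependencies declared here
```

With an empty list, the ordering of pam, nhc and slurm relies on the playbook listing the roles in that order, as in test.yml above.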

martbhell commented 4 years ago

Thanks!