ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

SCM Update options ignored for projects with requirements.yml #1072

Closed ITD27M01 closed 6 years ago

ITD27M01 commented 6 years ago
ISSUE TYPE
COMPONENT NAME
SUMMARY

The SCM Update options let you choose what happens during a project update. But for projects with a requirements.yml file, these options are ignored because ansible-galaxy runs with the --force flag.

https://github.com/ansible/awx/blob/1.0.2/awx/playbooks/project_update.yml#L142
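As a rough illustration of the difference being asked for (not the contents of the linked playbook), the sketch below builds the two command lines. `galaxy_install_command` is a hypothetical helper and the paths are placeholders; only the `-r`, `-p` and `--force` options of `ansible-galaxy install` are taken as given.

```python
from typing import List


def galaxy_install_command(requirements: str, roles_path: str, force: bool) -> List[str]:
    """Build an ansible-galaxy command line (hypothetical helper, not AWX code)."""
    cmd = ["ansible-galaxy", "install", "-r", requirements, "-p", roles_path]
    if force:
        # --force reinstalls roles even when they are already present.
        cmd.append("--force")
    return cmd


# Behaviour described in this issue: roles are reinstalled on every run.
print(" ".join(galaxy_install_command("roles/requirements.yml", "roles/", force=True)))
# Behaviour requested: roles that are already installed are left alone.
print(" ".join(galaxy_install_command("roles/requirements.yml", "roles/", force=False)))
```

Without `--force`, ansible-galaxy skips roles that are already installed, which is why a pinned version in requirements.yml may not be picked up after it changes.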

ENVIRONMENT
STEPS TO REPRODUCE
  1. Configure a project with a requirements.yml file.
  2. Start a Job Template based on the project.
  3. Roles are downloaded every time.
EXPECTED RESULTS

Roles from requirements.yml are downloaded only once.

ACTUAL RESULTS

Roles from requirements.yml are downloaded every time.

ADDITIONAL INFORMATION
wenottingham commented 6 years ago

This is mostly intentional... there's not a good way to ensure you're using the specified version otherwise.

ITD27M01 commented 6 years ago

@wenottingham I am confused by scm_full_checkout. I want to disable the project checkout during job template runs, but I cannot, because a full checkout runs on every job run.

I think this logic should be reverted:

https://github.com/ansible/awx/blob/1.0.2/awx/main/tasks.py#L1397

Check out roles only when I ask for a full project checkout, and do not check out roles during job runs.

The reason for disabling the roles checkout is a race condition during job runs: I cannot run simultaneous jobs because some of them fail with a "file not found" error.

wenottingham commented 6 years ago

Note that the checkout is not guaranteed to exist before job launch, so we need to do the checkout during job runs.

The issue would be the race - that's what should be fixed, not the checkout. What's the race you're seeing?

ITD27M01 commented 6 years ago

Note that the checkout is not guaranteed to exist before job launch, so we need to do the checkout during job runs.

So what exactly does the "Update on Launch" checkbox mean?

http://docs.ansible.com/ansible-tower/latest/html/userguide/projects.html#manage-playbooks-using-source-control

image

My apologies for such an example :)

wenottingham commented 6 years ago

Update on launch means "update the revision to be checked out before launching the job".

ITD27M01 commented 6 years ago

@wenottingham Can you please explain what "revision" means? The project is divided into the main playbook and roles. Is the main playbook checked out during a job run only if "Update on Launch" is set, while roles are checked out every time regardless of the "Update on Launch" checkbox?

wenottingham commented 6 years ago

How AWX handles SCM is with two different types of jobs:

grahamneville commented 6 years ago

Is it not possible to have the 'Checkout' job optional for each job run and let the user decide? This would save some seconds between each job and speed things up where a checkout isn't needed each time. You select what playbook to run from a project at creation time so maybe this is possible?

We also have an issue where every job run goes off and performs a checkout job. If multiple different jobs using the same project run at the same time, the jobs queue up because of the checkout process. This is another reason why it would be nice to have the checkout job as an option.

wenottingham commented 6 years ago

Is it not possible to have the 'Checkout' job optional for each job run and let the user decide?

This amounts to a 'break my job template' checkbox for people to check, and we're not going to add that.

grahamneville commented 6 years ago

Why does the checkout/update need to happen on every job execution? When a project is first added, it downloads from source control, and at that point all the source control files exist on the filesystem. A job template is then created and points to the playbooks on the filesystem. Should we not then be able to choose whether to update the project on each job run? If we choose not to, then it just uses the files downloaded during project setup?

It's not great that multiple different jobs using the same project queue up because of the checking of any project updates when there might not be any updates to retrieve.

wenottingham commented 6 years ago

Why does the checkout/update need to happen on every job execution? When a project is added first time it downloads from source control, at that point all the source control files exist on the filesystem.

This is not guaranteed, as multiple nodes can exist in the AWX cluster, not all of which would be the node that checked the project out.

grahamneville commented 6 years ago

Could there be a process that tells the other nodes to also download the files from source control whenever a project is added or updated, then it's not up to the job template. This way we could have the option to checkout/update from source per job template?

wenottingham commented 6 years ago

Maybe. But that's a very different thing, and requires some significant changes to the underpinning of how projects work, how distributed tasks work, and what it means when nodes come and go from the cluster over time where updates happen.

grahamneville commented 6 years ago

I've opened #1129 to request it as a feature.

wenottingham commented 6 years ago

closing in favor of https://github.com/ansible/awx/issues/1129, as the behavior noted in this issue is currently intentional.

ITD27M01 commented 6 years ago

@wenottingham

I'm sorry but I have more questions.

If I run 100 jobs, how many checkout tasks are performed? If there are 100 checkout tasks (with the --force option), how do they compete with each other on the file system?

Here is an example of job fail:

{
    "msg": "IOError: [Errno 2] No such file or directory: '/var/lib/awx/projects/_15__simple_play/roles/rsyslogd/templates/rsyslog.rules.j2'",
    "failed": true,
    "_ansible_item_result": true,
    "_ansible_no_log": false,
    "item": {
        "content": [
            {
                "item": "$ModLoad",
                "value": "imuxsock"
            },
            {
                "item": "$ModLoad",
                "value": "imklog"
            },
            {
                "item": "$ActionFileDefaultTemplate",
                "value": "RSYSLOG_TraditionalFileFormat"
            },
            {
                "item": "$IncludeConfig",
                "value": "/etc/rsyslog.d/*.conf"
            },
            {
                "item": "*.info;mail.none;authpriv.none;cron.none",
                "value": "/var/log/messages"
            },
            {
                "item": "authpriv.*",
                "value": "/var/log/secure"
            },
            {
                "item": "mail.*",
                "value": "-/var/log/maillog"
            },
            {
                "item": "cron.*",
                "value": "/var/log/cron"
            },
            {
                "item": "*.emerg",
                "value": "*"
            },
            {
                "item": "uucp,news.crit",
                "value": "/var/log/spooler"
            },
            {
                "item": "local7.*",
                "value": "/var/log/boot.log"
            }
        ],
        "file": "/etc/rsyslog.conf"
    }
}
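The failure above is consistent with a reader hitting the gap in a forced reinstall. The sketch below simulates that gap in a temporary directory, under the assumption that a forced install removes the role directory before writing it again; `install_role`, `force_reinstall` and `demo` are hypothetical names, and no AWX or ansible-galaxy code is involved.

```python
import os
import shutil
import tempfile
from typing import Callable, List


def install_role(role_dir: str) -> None:
    """Create a fake role containing a single template file."""
    os.makedirs(os.path.join(role_dir, "templates"), exist_ok=True)
    with open(os.path.join(role_dir, "templates", "rsyslog.rules.j2"), "w") as fh:
        fh.write("# template\n")


def force_reinstall(role_dir: str, between: Callable[[], None]) -> None:
    """Remove the role, then install it again.

    `between` runs inside the gap and stands in for a concurrent job.
    """
    shutil.rmtree(role_dir)
    between()
    install_role(role_dir)


def demo() -> List[str]:
    events: List[str] = []
    project = tempfile.mkdtemp()
    role_dir = os.path.join(project, "roles", "rsyslogd")
    template = os.path.join(role_dir, "templates", "rsyslog.rules.j2")
    install_role(role_dir)

    def other_job_reads_template() -> None:
        try:
            with open(template) as fh:
                fh.read()
            events.append("read ok")
        except FileNotFoundError:
            events.append("file not found")

    other_job_reads_template()  # before the reinstall
    force_reinstall(role_dir, between=other_job_reads_template)  # during
    other_job_reads_template()  # after the reinstall
    shutil.rmtree(project)
    return events


print(demo())
```

Run as written, this prints `['read ok', 'file not found', 'read ok']`: only the read that lands between removal and reinstall fails, which is why the error would show up intermittently.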
ITD27M01 commented 6 years ago

Hi @wenottingham, do you have an answer to this?

If I run 100 jobs, how many checkout tasks are performed? If there are 100 checkout tasks (with the --force option), how do they compete with each other on the file system?

wenottingham commented 6 years ago

Yes, the checkout happens on each run. However, the project updates should be isolated from each other: there is a filesystem lock for this.

ITD27M01 commented 6 years ago

@wenottingham Is the logic for the "filesystem lock" in AWX itself, or is it an operating system feature?

wenottingham commented 6 years ago

Logic is in AWX using OS primitives... see awx/main/tasks.py:acquire_lock/release_lock
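For readers unfamiliar with the idea, here is a minimal sketch of an advisory file lock built on an OS primitive. It is not the code from awx/main/tasks.py: it assumes `fcntl.flock` on a Unix-like system, and `project_lock` and `is_locked` are hypothetical names used only for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def project_lock(lock_path: str) -> Iterator[int]:
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is available
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def is_locked(lock_path: str) -> bool:
    """Return True if another open file description currently holds the lock."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


lock_file = os.path.join(tempfile.mkdtemp(), ".project.lock")
with project_lock(lock_file):
    print("inside:", is_locked(lock_file))
print("outside:", is_locked(lock_file))
```

A lock like this only serializes code paths that take it. A job that reads role files without holding the same lock would not be protected by it.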

zx1986 commented 4 years ago

the checkout happens on each run.

If I run 100 jobs with requirements.yml and --force, will it update 100 times? If each update takes about 2 minutes, will it take 200 minutes in total? That seems pointless ...

robertinohio commented 4 years ago

What if the source control server is offline, but you still want the job to run using the locally cached version instead of fetching? That doesn't seem to work.