Open jkovacevic opened 5 years ago
I am willing to provide additional information regarding this problem if needed. One machine, which is not available at the moment, did not get stuck during the build process. All other machines (macOS, Ubuntu) unfortunately fail.
@jkovacevic The best you can do is to isolate the issue into a reproducible, minimal form, push it to a repo, and publish it on GitHub.
The screenshot gives me the impression that the root cause might actually be in the logic of the applied role itself, rather than in ansible-container.
But without a PoC repo that reproduces the hang, it will be impossible to say anything for sure.
ISSUE TYPE
container.yml
OS / ENVIRONMENT
SUMMARY
We are using an Ansible playbook to create Docker images for our services. For the conductor we use the image ansible/container-conductor-ubuntu-xenial:0.9.2 from Docker Hub. Our services are created from the store/oracle/serverjre image.
We experience non-deterministic behavior while running ansible-container build; there is a chance that the build will get stuck (this image shows the point where it gets stuck: https://imgur.com/gChyVXI). The command that is executed is given below:
ansible-container build --no-cache --services {{ component }}
To work around this problem, we have to stop the execution (i.e. Ctrl+C) and run it again. The build gets stuck in approximately 40% of runs. The behavior is non-deterministic even when we build different services.
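As a stopgap, the manual Ctrl+C-and-rerun step could be automated with a wrapper along these lines. This is only a sketch: the function name `run_with_retry`, the timeout, and the attempt count are hypothetical, and it assumes that killing the ansible-container process is enough to recover (containers it started may still need separate cleanup).

```python
import subprocess
import sys


def run_with_retry(cmd, timeout_s, max_attempts=3):
    """Run cmd; if it exceeds timeout_s, kill it and try again.

    Returns the 1-based number of the attempt that finished.
    A non-zero exit status is treated as a real failure and is not retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired on timeout;
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True, timeout=timeout_s)
            return attempt
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt} exceeded {timeout_s}s, retrying",
                  file=sys.stderr)
    raise RuntimeError(f"command still hanging after {max_attempts} attempts")


# Hypothetical usage (service name and timeout are placeholders):
# run_with_retry(
#     ["ansible-container", "build", "--no-cache", "--services", "my-service"],
#     timeout_s=1800,
# )
```

The timeout has to be comfortably longer than a normal build, otherwise healthy builds get killed too.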
After debugging, we found that the process gets stuck at this line of code: https://github.com/ansible/ansible-container/blob/9250b44e0810c74c19b1f3610799d052b54f9018/container/core.py#L709. We therefore suspect a race between threads.
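One way to check the thread-race suspicion would be to dump the stack of every thread once the process has been running suspiciously long. The sketch below uses `faulthandler` from the Python 3 standard library. The helper names are hypothetical, and it assumes you can add the call to the Python process that actually hangs, which may not be the case if the hang is inside the conductor container.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s, out=None):
    """Dump all thread tracebacks to `out` every timeout_s seconds.

    `out` must be a real file object (it needs a file descriptor);
    it defaults to sys.stderr. The process keeps running after a dump.
    """
    target = sys.stderr if out is None else out
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target,
                                      exit=False)


def disarm_hang_watchdog():
    """Cancel the pending dump, e.g. once the build finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

If two threads show up blocked on each other in the dump, that would support the race hypothesis; if only one thread is waiting on I/O, the cause is more likely elsewhere.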
STEPS TO REPRODUCE
Run:
ansible-container build --no-cache --services {{ component }}
within an Ansible playbook, until it gets stuck.
EXPECTED RESULTS
The image should build deterministically, without hanging.
ACTUAL RESULTS
Image building sometimes gets stuck.