NVIDIA / ansible-role-nvidia-driver

configure persistenced service to turn on persistence mode fails in Ubuntu 18.04 #66

Open gabrik opened 2 years ago

gabrik commented 2 years ago

First, thanks for this very useful role.

I'm having an issue when trying to install the drivers on an Ubuntu 18.04 VM with PCIe passthrough (on plain KVM).

I ran the playbook like this:

ansible-playbook -i vm-inventory.yml -e host=test -e nvidia_driver_ubuntu_install_from_cuda_repo=yes -e nvidia_driver_persistence_mode_on=no nvidia-driver.yml

and everything seems fine, except that it still runs the "configure persistenced service to turn on persistence mode" task, which then fails:

TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
fatal: [test]: FAILED! => {"changed": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936608.0951219-393830-90866238334705/source not found"}

Below is the verbose output:


TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
task path: /home/ato/.ansible/roles/nvidia.nvidia_driver/tasks/main.yml:26
<test> CONNECT TO qemu+ssh://ato@majinbu/system
<test> FIND DOMAIN test
<test> ESTABLISH community.libvirt.libvirt_qemu CONNECTION
<test> EXEC /bin/sh -c 'echo ~ && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "echo ~ && sleep 0"]}}
<test> GA return: {'return': {'pid': 31557}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31557}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'fgo=', 'exited': True}}
<test> GA stdout: ~
<test> GA stderr:
<test> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp `"&& mkdir "` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" && echo ansible-tmp-1661936792.938617-393878-3538192051430="` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" ) && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "( umask 77 && mkdir -p \"` echo ~/.ansible/tmp `\"&& mkdir \"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" && echo ansible-tmp-1661936792.938617-393878-3538192051430=\"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" ) && sleep 0"]}}
<test> GA return: {'return': {'pid': 31559}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31559}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'YW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzA9fi8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAK', 'exited': True}}
<test> GA stdout: ansible-tmp-1661936792.938617-393878-3538192051430=~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/stat.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmpx3vmg1rs TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py", "mode": "wb+"}}
<test> GA return: {'return': 1088}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1088}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31567}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31567}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31570}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31570}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'CnsiY2hhbmdlZCI6IGZhbHNlLCAic3RhdCI6IHsiZXhpc3RzIjogZmFsc2V9LCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InBhdGgiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgImZvbGxvdyI6IGZhbHNlLCAiZ2V0X2NoZWNrc3VtIjogdHJ1ZSwgImNoZWNrc3VtX2FsZ29yaXRobSI6ICJzaGExIiwgImdldF9tZDUiOiBmYWxzZSwgImdldF9taW1lIjogdHJ1ZSwgImdldF9hdHRyaWJ1dGVzIjogdHJ1ZX19fQo=', 'exited': True}}
<test> GA stdout:
{"changed": false, "stat": {"exists": false}, "invocation": {"module_args": {"path": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "follow": false, "get_checksum": true, "checksum_algorithm": "sha1", "get_md5": false, "get_mime": true, "get_attributes": true}}}
<test> GA stderr:
<test> PUT /home/ato/.ansible/roles/nvidia.nvidia_driver/files/nvidia-persistenced-override.conf TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "mode": "wb+"}}
<test> GA return: {'return': 1089}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1089}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31574}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31574}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/copy.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmp9ikfkhqz TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py", "mode": "wb+"}}
<test> GA return: {'return': 1090}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1090}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31577}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31577}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31580}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31580}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 1, 'out-data': 'CnsiZmFpbGVkIjogdHJ1ZSwgIm1zZyI6ICJTb3VyY2UgL3Jvb3QvLmFuc2libGUvdG1wL2Fuc2libGUtdG1wLTE2NjE5MzY3OTIuOTM4NjE3LTM5Mzg3OC0zNTM4MTkyMDUxNDMwL3NvdXJjZSBub3QgZm91bmQiLCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InNyYyI6ICIvcm9vdC8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAvc291cmNlIiwgImRlc3QiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgIl9vcmlnaW5hbF9iYXNlbmFtZSI6ICJudmlkaWEtcGVyc2lzdGVuY2VkLW92ZXJyaWRlLmNvbmYiLCAiZm9sbG93IjogZmFsc2UsICJjaGVja3N1bSI6ICJkOWRiNGI3M2Q4M2VjYTk4YWMyMTc5YTAyODQ3ZjFhYjBhMmNiODY2IiwgImJhY2t1cCI6IGZhbHNlLCAiZm9yY2UiOiB0cnVlLCAidW5zYWZlX3dyaXRlcyI6IGZhbHNlLCAiY29udGVudCI6IG51bGwsICJ2YWxpZGF0ZSI6IG51bGwsICJkaXJlY3RvcnlfbW9kZSI6IG51bGwsICJyZW1vdGVfc3JjIjogbnVsbCwgImxvY2FsX2ZvbGxvdyI6IG51bGwsICJtb2RlIjogbnVsbCwgIm93bmVyIjogbnVsbCwgImdyb3VwIjogbnVsbCwgInNldXNlciI6IG51bGwsICJzZXJvbGUiOiBudWxsLCAic2VsZXZlbCI6IG51bGwsICJzZXR5cGUiOiBudWxsLCAiYXR0cmlidXRlcyI6IG51bGx9fX0K', 'exited': True}}
<test> GA stdout:
{"failed": true, "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found", "invocation": {"module_args": {"src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "_original_basename": "nvidia-persistenced-override.conf", "follow": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "backup": false, "force": true, "unsafe_writes": false, "content": null, "validate": null, "directory_mode": null, "remote_src": null, "local_follow": null, "mode": null, "owner": null, "group": null, "seuser": null, "serole": null, "selevel": null, "setype": null, "attributes": null}}}
<test> GA stderr:
<test> EXEC /bin/sh -c 'rm -f -r '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' > /dev/null 2>&1 && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "rm -f -r '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' > /dev/null 2>&1 && sleep 0"]}}
<test> GA return: {'return': {'pid': 31583}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31583}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
fatal: [test]: FAILED! => {
    "changed": false,
    "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
    "diff": [],
    "invocation": {
        "module_args": {
            "_original_basename": "nvidia-persistenced-override.conf",
            "attributes": null,
            "backup": false,
            "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
            "content": null,
            "dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf",
            "directory_mode": null,
            "follow": false,
            "force": true,
            "group": null,
            "local_follow": null,
            "mode": null,
            "owner": null,
            "remote_src": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source",
            "unsafe_writes": false,
            "validate": null
        }
    },
    "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found"
}

Any suggestions?

gabrik commented 2 years ago

Strangely, running the playbook directly on the VM seems to work.
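
For anyone reading the trace: the "out-data" fields in the "GA return:" lines are base64, and decoding them makes the log easier to follow. The sketch below is only a reading aid, not part of the role or of the connection plugin; the helper name decode_ga_return is made up for illustration, and it assumes the payload is printed as a Python-style dict, as it is in the log above. Decoding the first probe shows that "echo ~" returned a literal "~", while the copy module later looks for the source under /root/.ansible/tmp. One possible reading, which is an assumption and not confirmed here, is that the tilde is never expanded over the guest-agent connection, so the file is uploaded to one path and searched for in another.

```python
import ast
import base64


def decode_ga_return(line: str) -> str:
    """Decode the base64 'out-data' field of a 'GA return:' payload.

    `line` is the dict text that follows 'GA return: ' in the verbose log.
    Returns an empty string when the payload carries no 'out-data'.
    """
    # The log prints Python-style dicts (single quotes, True/False),
    # so ast.literal_eval is used instead of json.loads.
    payload = ast.literal_eval(line)
    result = payload.get("return")
    if not isinstance(result, dict):
        # e.g. {'return': 1088} from guest-file-open has no output to decode
        return ""
    out_data = result.get("out-data", "")
    return base64.b64decode(out_data).decode("utf-8", errors="replace")


# Payload copied from the first 'echo ~' probe in the log above.
sample = "{'return': {'exitcode': 0, 'out-data': 'fgo=', 'exited': True}}"
print(repr(decode_ga_return(sample)))  # → '~\n'
```

The same helper can be pointed at the longer payloads from the stat and copy steps to recover the JSON that the modules returned.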