Closed writemike closed 5 years ago
@writemike
None of the modules have been tested on 14.x, looking at the error output:
"utilCmdArgs\\\\\": \\\\\"-c \\\\\\\\\\\\\"tmsh -c \\\\\\\\\\\\\\\\\\\\\\\\\\\\\"save sys ucs /var/tmp/tmp_test.ucs
Can you check in the audit logs, what is the exact command being run, cause I do not think having this:
utilCmdArgs: "-c "tmsh -c save sys ucs"
is what it should look like.
I guess I could look at maybe having an option to just SAVE the ucs file/not download added to bigip_ucs_fetch module, as we have endpoints to save/load UCS in the REST api. I need to think about it.
We saw the same result on the customer's BIG-IP running 12.1.4. 14.1 is just what they have running in their lab for repro. Here are the logs: Audit log:
Feb 7 16:03:45 bigip-14-1.lab.local notice tmsh[13672]: 01420002:5: AUDIT - pid=13672 user=admin folder=/Common module=(tmos)# status=[Command OK] cmd_data=save sys ucs /var/tmp/tmp_test.ucs
Feb 7 16:03:45 bigip-14-1.lab.local notice icrd_child[28513]: 01420002:5: AUDIT - pid=28513 user=admin folder=/Common module=(tmos)# status=[Command OK] cmd_data=run util bash -c "tmsh -c \"save sys ucs /var/tmp/tmp_test.ucs\""
restjavad.0.log:
[WARNING][1356][06 Feb 2019 03:04:24 UTC][8100/mgmt ForwarderPassThroughWorker] URI:http://localhost:8100/mgmt/tm/util/bash/, Referrer:10.200.1.1, Method:POST, Exception:java.util.concurrent.TimeoutException: remoteSender:10.200.1.1, method:POST
at com.f5.rest.common.RestWorker.logAndFailExpiredOperation(RestWorker.java:3113)
at com.f5.rest.common.RestWorker.checkForExpiredOperations(RestWorker.java:3102)
at com.f5.rest.common.RestServer.checkAndExpirePendingWorkerOperations(RestServer.java:1263)
at com.f5.rest.common.RestServer.access$200(RestServer.java:44)
at com.f5.rest.common.RestServer$3.run(RestServer.java:1236)
at com.f5.rest.common.ScheduleTaskManager$2$1.run(ScheduleTaskManager.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
[WARNING][1357][06 Feb 2019 03:04:24 UTC][8100/tm/auth/user PublicWorker] URI:http://localhost:8100/tm/util/bash/, Referrer:10.200.1.1, Method:POST, Exception:java.util.concurrent.TimeoutException: remoteSender:10.200.1.1, method:POST
at com.f5.rest.common.RestWorker.logAndFailExpiredOperation(RestWorker.java:3113)
at com.f5.rest.common.RestWorker.checkForExpiredOperations(RestWorker.java:3102)
at com.f5.rest.common.RestServer.checkAndExpirePendingWorkerOperations(RestServer.java:1263)
at com.f5.rest.common.RestServer.access$200(RestServer.java:44)
at com.f5.rest.common.RestServer$3.run(RestServer.java:1236)
at com.f5.rest.common.ScheduleTaskManager$2$1.run(ScheduleTaskManager.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
[SEVERE][1359][06 Feb 2019 03:04:24 UTC][ChildWrapper] Timed out waiting for POST request at http://localhost:8100/tm/util/bash/. Killing child process 31560
[I][1360][06 Feb 2019 03:04:24 UTC][ChildWrapper] Killing IcrdChild process 31560
[I][1362][06 Feb 2019 03:04:24 UTC][ProcessManager] Enabling transaction for child 12701332-d920-4fca-9d19-7a23b74486f6.
Here is the equivalent curl test. Maybe it is an iControl REST issue?
curl -X POST \
https://10.200.1.10/mgmt/tm/util/bash/ \
-H 'Content-Type: application/json' \
-H 'X-F5-Auth-Token: WDLVYSJ5OXRZPM2OXFULLJYL4R' \
-H 'cache-control: no-cache' \
-d '{
"command": "run",
"utilCmdArgs": "-c \"tmsh -c \\\"save sys ucs /var/tmp/tmp_test.ucs\\\"\""
}'
and result:
400 Bad Request
Time:70649 ms
Size:893 B
{
"code": 400,
"message": "remoteSender:10.200.1.1, method:POST ",
"originalRequestBody": "{\n \"command\": \"run\",\n \"utilCmdArgs\": \"-c \\\"tmsh -c \\\\\\\"save sys ucs /var/tmp/tmp_test.ucs\\\\\\\"\\\"\"\n}",
"referer": "10.200.1.1",
"restOperationId": 7985444,
"kind": ":resterrorresponse"
}
This is happening because of the hard coded timeout in the REST API server code.
This problem is handled for you, or should be, in the bigip_ucs_fetch module.
This cannot be fixed in the command module because it's a core bug of the rest API. Even in the ucs_fetch module, we work around it by ignoring the timeout error and ps'ing the processes on box until the UCS process disappears.
Suggest you try the bigip_ucs_fetch module
Thanks for the update! We tried the bigip_ucs_fetch module originally, but it was running really slow until we tried the fix for "bigip_ucs_fetch hangs #1152" which has now solved that problem. We will go back to using the bigip_ucs_fetch module.
ISSUE TYPE
COMPONENT NAME
bigip_command
ANSIBLE VERSION
PYTHON VERSION
BIGIP VERSION
CONFIGURATION
OS / ENVIRONMENT
Ansible is running from CentOS 7 Docker Instance and BIG-IP is a VE running in VMware Fusion Professional Version 11.0.2 (10952296).
SUMMARY
Intermittently responding with 400 Unexpected Error to "tmsh save sys ucs /var/tmp/tmp_test.ucs" when executed from bigip_command module. The UCS file is always successfully created on the BIG-IP, even when we see the error message.
STEPS TO REPRODUCE
Run playbook below.
EXPECTED RESULTS
ACTUAL RESULTS
Both tasks always complete successfully. The delete UCS task always responds without any errors, but the create UCS task sends the 400 error most of the time. The UCS file is always successfully created on the BIG-IP in /var/tmp/. Any workarounds in the task to not get the error (interval, retries, wait_for)? I have tested these optional parameters and saw no change in response.