Open treydock opened 7 years ago
Apr 18 09:48:56 metrics DDNTool: - ERROR - Process ddn-scratch1b caught APIException exception.
Traceback (most recent call last):
  File "/usr/bin/DDNTool.py", line 155, in one_controller
    client.run()
  File "/usr/lib/python2.7/site-packages/DDNToolSupport/SFAClientUtils/SFAClient.py", line 260, in run
    self._fast_poll_tasks()
  File "/usr/lib/python2.7/site-packages/DDNToolSupport/SFAClientUtils/SFAClient.py", line 308, in _fast_poll_tasks
    vd_stats = SFAVirtualDiskStatistics.getAll()
  File "/usr/lib/python2.7/site-packages/ddn/sfa/core.py", line 1095, in wrapper
    raise APIException(ex)
APIException: 1000: SFA/MI connection error
Judging by the traceback, you're getting an exception when you call SFAVirtualDiskStatistics.getAll().
That's coming from the SFA library itself. I'm not sure what exception code 1000 means, but it looks like DDNTool isn't able to contact the DDN hardware.
What firmware version are you running on your controllers? And is it something you just installed?
3.1.0.1 is the current version. This issue occurred during a scheduled reboot of our controllers. Restarting DDNTool resolved the issue.
That's interesting. DDNTool is supposed to be able to automatically reconnect after a DDN controller is rebooted, but I haven't explicitly tested that in a while. Possibly something changed in the 3.x firmware. I'll look into it.
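For anyone following along, here is a rough sketch of the kind of reconnect-on-error loop described above. It is not DDNTool's actual code: `poll_with_reconnect`, `ConnectionLost`, and the `connect`/`poll` callables are hypothetical names made up for illustration, with `ConnectionLost` standing in for an error such as the APIException in the traceback.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a connection error such as APIException."""


def poll_with_reconnect(connect, poll, max_polls, max_retries=5,
                        retry_delay=1.0, sleep=time.sleep):
    """Call poll() until max_polls results are collected.

    connect and poll are caller-supplied callables (assumptions, not part
    of any DDN library). When either raises ConnectionLost, wait and then
    reconnect; give up after max_retries consecutive failures.
    """
    results = []
    failures = 0
    connected = False
    while len(results) < max_polls:
        try:
            if not connected:
                connect()
                connected = True
            results.append(poll())
            failures = 0  # a successful poll resets the failure count
        except ConnectionLost:
            connected = False
            failures += 1
            if failures > max_retries:
                raise  # controller is still unreachable; let the caller decide
            sleep(retry_delay * failures)  # linear backoff between attempts
    return results
```

Note that this only helps when the failed call actually raises. If the connect call blocks forever, the loop never gets a chance to retry.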
Just a quick update: We tested this scenario with firmware version 3.0.1.5 and it looks like a bug on DDN's side. After rebooting the DDN, the APIConnect() function just hangs and never actually connects. We're going to upgrade our test system to the latest firmware and see if this still happens. If so, I'll file a bug report with DDN.
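As a possible workaround on the client side while the hang is investigated, a blocking connect call could be run under a watchdog timeout. The following is only a sketch under assumptions: `call_with_timeout` and `ConnectTimeout` are hypothetical helpers, and `func` would be whatever zero-argument callable wraps the real connect call. Nothing here is part of the DDN API.

```python
import threading


class ConnectTimeout(Exception):
    """Raised when the wrapped call does not finish within the timeout."""


def call_with_timeout(func, timeout):
    """Run func() in a daemon thread and wait at most timeout seconds.

    func is a caller-supplied callable (an assumption for illustration).
    If it is still running after the timeout, raise ConnectTimeout.
    The stuck thread is abandoned, not killed.
    """
    outcome = {}

    def target():
        try:
            outcome["value"] = func()
        except Exception as exc:  # hand the error back to the calling thread
            outcome["error"] = exc

    worker = threading.Thread(target=target)
    worker.daemon = True  # a hung worker must not block interpreter exit
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        raise ConnectTimeout("call did not return within %s seconds" % timeout)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

This lets the process notice the hang and log it or exit so that a supervisor can restart it. It does not fix the underlying problem, and each timed-out call leaves a stuck thread behind, so it should not be retried indefinitely within one process.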
We rebooted our DDN controllers and it crashed DDNTool. The logs show the process restarting, but after that there are no further logs from DDNTool and metrics stopped being collected.