Open Jeparre opened 8 years ago
Same issue here.
oned log:
[Z0][InM][D]: Monitoring host 192.168.1.20 (10)
[Z0][InM][I]: Command execution fail: scp -r /var/lib/one/remotes/. 192.168.1.20:/var/tmp/one
[Z0][InM][I]: /var/lib/one/remotes/./im/xen.d/collectd-client.rb: No such file or directory
[Z0][InM][I]: /var/lib/one/remotes/./im/xen.d/collectd-client_control.sh: No such file or directory
[Z0][InM][I]: /var/lib/one/remotes/./xen/prereconfigure: No such file or directory
[Z0][InM][I]: /var/lib/one/remotes/./xen/reconfigure: No such file or directory
[Z0][InM][I]: ExitCode: 1
[Z0][ONE][E]: Error monitoring Host 192.168.1.20 (10):
This looks like a missing-files problem. The files do exist, but they are symbolic links (collectd-client.rb, collectd-client_control.sh, prereconfigure, reconfigure). As for the scp failure, the same command works when I run it manually. I don't understand why.
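One thing that may be worth checking, though nothing in this thread confirms it: scp -r follows symbolic links, so a link whose target does not resolve shows up as "No such file or directory" even though the link itself is visible in a directory listing. A minimal Ruby sketch that lists every symlink under a directory and flags the dangling ones (the helper name symlink_report is made up for illustration):

```ruby
require 'find'

# Return one entry per symbolic link found under `root`: the link path,
# the target it points at, and whether that target currently resolves.
def symlink_report(root)
  report = []
  Find.find(root) do |path|
    next unless File.symlink?(path)
    report << {
      path: path,
      target: File.readlink(path),
      # File.exist? follows the link, so it is false for a dangling link.
      resolves: File.exist?(path)
    }
  end
  report
end
```

Calling symlink_report('/var/lib/one/remotes') on the front-end and printing the entries where resolves is false would show whether the four files from the log are dangling links.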
This pull request should solve all your problems: https://github.com/OpenNebula/addon-xen/pull/9 along with this OpenNebula fix: https://github.com/OpenNebula/one/pull/139
Hi Laurent, I just installed everything from addon-xen master and also made sure that I do not have leftovers of the old broken and partly self-fixed addon. I am still getting this error:
Wed Nov 9 10:18:23 2016 [Z0][ONE][E]: Error parsing host information: syntax error, unexpected VARIABLE, expecting EQUAL or EQUAL_EMPTY at line 1, columns 7:16. Monitoring information:
Error executing sudo /usr/sbin/xentop -fbi2
I also applied the mentioned One fix #139 (I just quickly changed the line in vnm_mad/remotes/lib/vnmmad.rb by hand). I am using One 5.0.2 on Ubuntu 16.04.
Which commit(s) of your PR actually deal with this error? That would help me to debug this issue further.
Thank you very much!! all the best from Vienna Jojo
I just realized that not every monitoring request fails, only about every second one!?
Hi JOJ0,
I think I had this problem when I first tried the addon. Then I started to write the fixes and the problem went away. I find no difference between my xen drivers and the current state of the repository.
Try to run this as oneadmin on your problematic host and look at the output:
sudo /usr/sbin/xentop -fbi2
/var/tmp/one/vmm/xen/poll -t
/var/tmp/one/vmm/xen/poll
Also make sure you have this line in /etc/sudoers
oneadmin ALL=(ALL) NOPASSWD: /usr/sbin/xentop *
Best regards,
Laurent
Hi Laurent, thanks a lot for the hint. I had already partly checked this. xentop does work. The error is somewhere in the poll script.
oneadmin@dell2:~$ sudo /usr/sbin/xentop -fbi2
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 40 0.0 10229200 97.6 no limit n/a 4 0 0 0 0 0 0 0 0 0 0
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 40 2.1 10229200 97.6 no limit n/a 4 0 0 0 0 0 0 0 0 0 0
oneadmin@dell2:~$
oneadmin@dell2:~$ /var/tmp/one/vmm/xen/poll -t
Error executing sudo /usr/sbin/xentop -fbi2
oneadmin@dell2:~$ /var/tmp/one/vmm/xen/poll
Error executing sudo /usr/sbin/xentop -fbi2
What OS and Xen version are you using on your hypervisor? Maybe there is a tiny difference in the output of xentop, and that's why the parsing goes wrong?
This is Ubuntu 14.04/Xen 4.4.
$ sudo /usr/sbin/xentop -fbi2
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 238 0.0 8388608 3.1 8388608 3.1 32 0 0 0 0 0 0 0 0 0 0
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 238 1.3 8388608 3.1 8388608 3.1 32 0 0 0 0 0 0 0 0 0 0
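For comparison, the two Domain-0 rows posted in this thread do differ in one place: one shows "no limit n/a" in the MAXMEM columns, the other shows numbers. How the poll script tokenizes these rows is not shown here, so the following is only a sketch under the assumption that fields are split on whitespace. In that case "no limit" becomes two tokens and every later column shifts by one. The helper names are made up for illustration:

```ruby
# Domain-0 rows copied from the two xentop outputs posted in this thread.
ROW_WITH_LIMIT = 'Domain-0 -----r 238 1.3 8388608 3.1 8388608 3.1 32 0 0 0 0 0 0 0 0 0 0'
ROW_NO_LIMIT   = 'Domain-0 -----r 40 2.1 10229200 97.6 no limit n/a 4 0 0 0 0 0 0 0 0 0 0'

# Naive tokenizer (assumption): split on runs of whitespace.
def naive_fields(row)
  row.split
end

# Tolerant tokenizer: fold "no limit" into a single token before splitting,
# so the row lines up with the 19 header columns again.
def tolerant_fields(row)
  row.gsub('no limit', 'no-limit').split
end

puts naive_fields(ROW_WITH_LIMIT).size   # 19, same as the header
puts naive_fields(ROW_NO_LIMIT).size     # 20, one more than the header
puts tolerant_fields(ROW_NO_LIMIT).size  # 19 again
```

If the real script already normalizes "no limit", this difference is harmless and the cause lies elsewhere.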
Hey guys, I think I found a workaround for this error. I haven't found the bug in the code, but when I comment out the line so that it reads #load_vars(hypervisor, file, vars) in the file /var/tmp/one/vmm/xen/poll, my host can be monitored. I'm still trying to find out why, but maybe this helps somebody.
@Jeparre, I think you are more or less deactivating monitoring completely with this, but I am not sure.
The variable vars holds XM_POLL, which holds "sudo /usr/sbin/xentop -bi2", the actual "monitoring command" that is executed on the hypervisor.
Update: on the other hand, the exact same xentop line is stated in the xenrc file, which overrides XM_POLL anyway...
IMHO the error must be somewhere in "def self.get_all_vm_info", a complex piece of text parsing that extracts info from the xentop output (variable text in line 78).
If get_all_vm_info fails, it ends up in the rescue clause that prints the error we are seeing in the log (line 126):
rescue
    STDERR.puts "Error executing #{CONF['XM_POLL']}"
    nil
end
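A bare rescue like this catches any StandardError and always prints the same message, so a parsing failure looks exactly like a failed command. A minimal sketch of how the clause could report the real exception instead. Everything here is hypothetical: parse_monitoring is a stand-in for the parsing step and is not the actual poll code.

```ruby
# Hypothetical stand-in for the parsing step: it raises when the row does
# not have the expected number of whitespace-separated fields.
def parse_monitoring(text)
  fields = text.split
  raise ArgumentError, "expected 19 fields, got #{fields.size}" unless fields.size == 19
  fields
end

# Same shape as the rescue clause quoted above, but the exception class,
# message and the first backtrace frames are written out as well.
def poll(text, err: $stderr)
  parse_monitoring(text)
rescue StandardError => e
  err.puts "Error executing poll command: #{e.class}: #{e.message}"
  err.puts e.backtrace.first(3).join("\n") if e.backtrace
  nil
end
```

Temporarily adding the "=> e" part and printing e.message and e.backtrace in the real script should show which line of the parser actually fails.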
Well, those are just my uneducated guesses and analysis. They could all be wrong, but maybe they help you take the thing apart further.
Jojo
@Igrawet, I never got back to you. I am also using Ubuntu 14.04/Xen 4.4, so my theory that a difference in the xentop output makes the parsing fail is very, very unlikely ;-)
Hmm, so I will try to find that error in the code. An off-topic question: are you using LVM with OpenNebula?
Regards, Jeparre
@jeparre No, I don't, but I use DRBD with OpenNebula, which also just accesses a block device, so there is some similarity. But let's discuss this on the forum instead. I suggest you just post your LVM question there.
@JOJ0 I think the problem is caused by missing attributes in dom['config']. In my case I added "if !dom['config'].has_key?('disks') then next end" at line 196 to temporarily avoid this error message.
I'm trying to configure a new node host with Xen and am having some trouble. This log appears in oned.log. Has anyone else already had this problem?
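The guard described above can be sketched in isolation like this. Only the 'config' and 'disks' keys come from this thread. The shape of the domain records and the helper name domains_with_disks are assumptions for illustration.

```ruby
# Keep only the domain records whose 'config' hash has a 'disks' entry.
# Records without 'config', or with a 'config' lacking 'disks', are skipped
# instead of raising further down in the parser.
def domains_with_disks(doms)
  doms.each_with_object([]) do |dom, kept|
    config = dom['config']
    next unless config.is_a?(Hash) && config.key?('disks')
    kept << dom
  end
end

# Made-up sample records, not real xentop or xl output.
sample = [
  { 'name' => 'one-12', 'config' => { 'disks' => ['xvda'] } },
  { 'name' => 'Domain-0', 'config' => {} },
  { 'name' => 'broken' }
]
puts domains_with_disks(sample).map { |d| d['name'] }.inspect
```

Skipping such records hides the symptom. Whether a domain without 'disks' is legitimate (for example Domain-0) or points to a deeper problem would still need to be checked.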
Error parsing host information: syntax error, unexpected VARIABLE, expecting EQUAL or EQUAL_EMPTY at line 1, columns 7:16. Monitoring information:
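A possible reading of this message, offered as a guess: the monitoring output is expected to consist of NAME=VALUE lines, and in the earlier report the output started with the text "Error executing sudo /usr/sbin/xentop -fbi2" instead. The word "executing" begins at column 7 of that line, which would fit the reported position. A small sketch that flags lines not starting with a NAME= prefix. The pattern is an assumed simplification and does not reproduce the real oned parser:

```ruby
# Assumed shape of a valid monitoring line: an attribute name followed by '='.
KEY_VALUE = /\A[A-Za-z_]\w*\s*=/

# Return [line_number, text] for every non-empty line that does not start
# with a NAME= prefix.
def offending_lines(output)
  output.each_line.with_index(1).reject do |line, _number|
    line.strip.empty? || line =~ KEY_VALUE
  end.map { |line, number| [number, line.chomp] }
end

# Illustrative input: one plausible attribute line plus the error text
# quoted earlier in this thread.
sample_output = "HYPERVISOR=xen\nError executing sudo /usr/sbin/xentop -fbi2\n"
offending_lines(sample_output).each do |number, text|
  puts "line #{number}: #{text}"
end
```

Running the probe by hand on the host and feeding its output through a check like this would show whether plain error text is being mixed into the monitoring data.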