arista-eosplus / ztpserver

Full service ZTP Server implementation for bootstrapping Arista EOS nodes
Other
70 stars 33 forks source link

install_extension action copies the file to the switch but doesn't install it #206

Closed alessandro-danti closed 10 years ago

alessandro-danti commented 10 years ago

As per subject - I am trying to install some rpm packages related to Puppet from the ZTP server during the ZTP bootstrap.

The packages are copied to the switch "extension" folder, but after running a "show extensions" command in EOS, they all show up as not installed.

I am sure that the RPM packages do install properly, as I have been doing it manually before using the ZTP server.

Is there any current issue related to the "install_extension" action or am I doing something wrong?

Thanks

alessandro-danti commented 10 years ago

this is a sample from my node definition file:

---
actions:
  -
    action: install_extension
    attributes:
      url: files/puppet/ruby-1.8.7.swix
    name: "Install Ruby (swix)"
    onstart: "Starting Ruby install.."
    onsuccess: "SUCCESS: Ruby installed"
    onfailure: "FAIL: Ruby not installed"
phil-dileo commented 10 years ago

Hi Alessandro, I cannot be certain why the extension is not being installed without seeing a log from the switch, but you can modify the action in the definition to force the installation.

actions:
  -
    action: install_extension
    attributes:
      url: files/puppet/ruby-1.8.7.swix
      force: true
    name: "Install Ruby (swix)"
    onstart: "Starting Ruby install.."
    onsuccess: "SUCCESS: Ruby installed"
    onfailure: "FAIL: Ruby not installed"
alessandro-danti commented 10 years ago

Hi Philip,

could you please point me to a location on the EOS filesystem where the logs for this specific action can be found? I will gladly post more information if I can find it.

Another strange behaviour that I noticed is that the "install_extension" action is not always executed when the bootstrap script is invoked (both at EOS ZTP boot or manually executed in the switch bash shell). If I create a definition file that only contains install actions, do you think it should properly execute? Or are there requirements which I wasn't able to understand?

thanks

phil-dileo commented 10 years ago

Hi Alessandro, There is some logic to determine if an action is passed to the node for execution. This can be overridden via definition attributes (note the 'always_execute'):

actions: 
  -
    action: install_extension
    attributes:
      url: files/puppet/ruby-1.8.7.swix
      force: true
    always_execute: true
    name: "Install Ruby (swix)"
    onstart: "Starting Ruby install.."
    onsuccess: "SUCCESS: Ruby installed"
    onfailure: "FAIL: Ruby not installed"

If you used packer-ztpserver to setup your ZTPServer, then you already have syslog and XMPP running on your ZTPServer. One of the first steps in the bootstrap process is for the node to GET /bootstrap/config. This returns the YAML file /usr/share/ztpserver/bootstrap/bootstrap.conf. An example file contains:

--- 
logging: 
  - 
    destination: "ztps.ztps-test.com:514"
    level: DEBUG
xmpp: 
  domain: im.ztps-test.com
  password: eosplus
  rooms: 
    - ztps
  username: bootstrap

This will cause the node to log bootstrap messages to the syslog and XMPP destinations included. Therefore, if you'd like to view both the ztps process console output and the node's bootstrap messages, try: ztps --debug & tail -f /var/log/messages

phil-dileo commented 10 years ago

Here's the logic I alluded to: http://ztpserver.readthedocs.org/en/develop/config.html?highlight=always_execute#node-specific-configuration

alessandro-danti commented 10 years ago

Thanks Philip, I will check and try again with your suggestions in mind. I'll let you know how it goes.

advornic commented 10 years ago

Hi Alessandro,

Did you manage to sort this out - do you need any more help figuring this out?

Thanks, Andrei

On Wed, Sep 17, 2014 at 1:58 PM, Alessandro Danti notifications@github.com wrote:

Thanks Philip, I will check and try again with your suggestions in mind. I'll let you know how it goes.

— Reply to this email directly or view it on GitHub https://github.com/arista-eosplus/ztpserver/issues/206#issuecomment-55889164 .

alessandro-danti commented 10 years ago

Hi, sorry for the delay, in between holidays and other tasks I am coming back at you only today.

I have tried once again to install the extensions via the ZTP server according to the suggestions that Philip wrote, but even if the ZTP server logs show that it was apparently successful, it is not - the packages are indeed copied to the extensions folder but they show up as A, NI (Available, Not Installed). I did verify through bash as well (rpm -qa | grep facter for one) and those are definitely not installed.

Here is a snippet from that specific node definition file where the facter RPM is being installed:

  -
    action: install_extension
    attributes:
      url: files/puppet/facter-2.2.0-1.el6.i386.rpm
      force: true
    always_execute: true
    name: "Install Facter (force)"
    onstart: "Starting facter install.."
    onsuccess: "SUCCESS: Facter installed"
    onfailure: "FAIL: Facter not installed"

And here is a snippet from the ZTP log that shows the apparent installation of the facter RPM:

Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: Executing action install_extension
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: Action install_extension: Starting facter install..
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: GET http://172.16.130.10:8080/files/puppet/facter-2.2.0-1.el6.i386.rpm
DEBUG: [controller:120] GET /files/puppet/facter-2.2.0-1.el6.i386.rpm HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, compress
Content-Length: 4
Content-Type: text/html
Host: 172.16.130.10:8080
User-Agent: python-requests/1.2.3 CPython/2.7.0 Linux/2.6.38.8.Ar-1847378.4137M
Resource: puppet/facter-2.2.0-1.el6.i386.rpm

172.16.130.102 - - [20/Oct/2014 10:52:38] "GET /files/puppet/facter-2.2.0-1.el6.i386.rpm HTTP/1.1" 200 98408
Oct 20 10:52:38 ZTPS - INFO: Server response to GET request: status=200
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: GET http://172.16.130.10:8080/meta/files/puppet/facter-2.2.0-1.el6.i386.rpm
172.16.130.102 - - [20/Oct/2014 10:52:38] "GET /meta/files/puppet/facter-2.2.0-1.el6.i386.rpm HTTP/1.1" 200 67
Oct 20 10:52:38 ZTPS - INFO: Server response to GET meta: contents={u'size': 98408, u'sha1': u'f23dc5f308da77df9bb9bf4eca2911a1503e25c5'}
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: Writing /mnt/flash/.extensions/facter-2.2.0-1.el6.i386.rpm...
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: Action executed succesfully (install_extension)
Oct 20 10:52:38 172.16.130.102 <14>ZTPS - INFO: Action install_extension: SUCCESS: Facter installed

As it shows, apparently all is well - but at the end is not. A manual install via the EOS CLI works as expected and gets the packages in the Installed state - but we would like to have the process fully automated.

Am I still missing something here? why the RPM file is copied but the installation is not triggered?

thanks. Alessandro

alessandro-danti commented 10 years ago

While trying to understand more on the process of RPM installation, I have been diving into the actual "install_extension" python script that is located under /usr/share/ztpserver/actions on the ZTP server.

I am not a Python expert, but I couldn't see anything that triggers the actual RPM package installation, as the last line reads:

open(BOOT_EXTENSIONS, 'a').write('\r\n%s' % line)

My understanding is that this Python script just copies the package to the extensions folder (which happens correctly) but doesn't do anything else to actually trigger the installation.

Is there a subsequent command/script to be executed in the definitions at bootstrap to start the package installation? I will have a look at the bootstrap script to see whether there is a set of instructions that deals with the actual package setup.

jerearista commented 10 years ago

Alessandro, The ZTP process works with the expectation of setting a system up, then reloading. The setup, which may include both configuration and software version changes to bring it in to compliance with the definition, is followed by a reload to activate new software versions and configurations. By adding the RPM to boot_extensions, the installation will be performed at reload, thus bootstrap does not attempt to immediately install the extensions.

-Jere

On Mon, Oct 20, 2014 at 6:58 AM, Alessandro Danti notifications@github.com wrote:

While trying to understand more on the process of RPM installation, I have been diving into the actual "install_extension" python script that is located under /usr/share/ztpserver/actions on the ZTP server.

I am not a Python expert, but I couldn't see anything that triggers the actual RPM package installation, as the last line reads:

open(BOOT_EXTENSIONS, 'a').write('\r\n%s' % line)

My understanding is that this Python script just copies the package to the extensions folder (which happens correctly) but doesn't do anything else to actually trigger the installation.

Is there a subsequent command/script to be executed in the definitions at bootstrap to start the package installation? I will have a look at the bootstrap script to see whether there is a set of instructions that deals with the actual package setup.

— Reply to this email directly or view it on GitHub https://github.com/arista-eosplus/ztpserver/issues/206#issuecomment-59726926 .

Jere Julian Arista Networks http://arista.com EOS+ Advanced Services Engineer Mobile: 919-250-8217 Office: 408-547-8418 jere@arista.com

advornic commented 10 years ago

Jere is right - issue #211 will enable you to actually run a command and install the extension during the bootstrap process (before a potential reboot). This is currently scheduled for v1.3 of the ZTP Server.

advornic commented 10 years ago

Does that make sense?

alessandro-danti commented 10 years ago

Jere and Andrei, many thanks for your reply. I understand what both of you mean - but the problem is that the extension doesn't get installed after the bootstrap reboot - hence my question above.

Is there any indication (file, configuration parameter) somewhere that EOS shall install the extensions after reboot? and is there a specific file I can check for some log/output about this specific task?

alessandro-danti commented 10 years ago

I am trying once again as I am writing here, I just noticed on the startup logging that it is indeed trying to install the extensions (a line says: "Installing EOS extensions") and then the subsequent line is "Invalid character found on line 2".

As I have now found this line, is there any specific log file I can check to find what is causing this error?

alessandro-danti commented 10 years ago

I went deeper in my investigation, finding out that /etc/rc.sysinit invokes a Python script that is in charge of installing the extensions.

I have replicated it manually on the bash shell and it was failing with the same error I've got during the boot sequence ("Invalid character found on line 2").

After trying to understand what could be the cause, as the extension list in the file was looking correct, I have checked if it was a DOS file and -bingo-, it was indeed like that and not correctly formatted for the loader script.

After correcting this issue with a simple tr to remove the carriage returns, I issued a new reload of the EOS and now the extensions have been installed properly.

I have then restarted the whole EOS setup process from scratch - factory reset, ZTP running and ready for bootstrap - and made sure that my "definition" file in the ZTP node folder is not a DOS file.

After the bootstrap and reboot, the same failure in the extensions loading script happened, checked and found that the extensions text file is again containing DOS characters.

What writes the file '/mnt/flash/boot-extensions'? I think that the script/function in charge of that shall be checked for carriage returns, as this appears to be the cause of the extension installation failure.

Here is what that file looks like when the extensions loader tries to parse it:

^M
ruby-1.8.7.swix^M
facter-2.2.0-1.el6.i386.rpm force

Alessandro

advornic commented 10 years ago

Hi Alessandro,

Thanks for digging into this - great work!

Indeed, this is a bug in the action code - see https://github.com/arista-eosplus/ztpserver/blob/develop/actions/install_extension. The last line should add a cage return only if the file is not empty.

Something like:

should probably do the trick. We will fix this asap in v1.2.

Thank you again for reporting this.

alessandro-danti commented 10 years ago

Thanks, I get quite stubborn when something doesn't work and I need to understand why :)

I will try changing that line in my ztp VM install_extension and see if it works until v1.2 is ready - so I can go on with my automation work in the meanwhile.

I'll keep you posted. Alessandro

alessandro-danti commented 10 years ago

Andrei: I have tried replacing the final line in install_extension with the latter from your post, but it doesn't seem to be working - it instead creates a huge extensions list file with a lot of whitelines in the middle...

Any other suggestion?

advornic commented 10 years ago

Please try:

lines = []
try:
   lines = [x.strip() for x in open(BOOT_EXTENSIONS).readlines() if x.strip()]
except IOError:
   # file missing
   pass
open(BOOT_EXTENSIONS, 'w').write('\n'.join(lines + [line]))
alessandro-danti commented 10 years ago

This one works fine, thanks Andrei :)