napalm-automation / napalm

Network Automation and Programmability Abstraction Layer with Multivendor support
Apache License 2.0

Junos SRX cluster get_environment() returns null. 15.1X49-D130.6 #1183

Closed mcroff-wish closed 4 years ago

mcroff-wish commented 4 years ago

Description of Issue/Question

Note: Please check https://guides.github.com/features/mastering-markdown/ to see how to properly format your request.

Did you follow the steps from https://github.com/napalm-automation/napalm#faq

(Place an x between the square brackets where applicable)

Setup

napalm version

(Paste verbatim output from pip freeze | grep napalm between quotes below)

napalm==2.5.0

Network operating system version

(Paste verbatim output from show version - or equivalent - between quotes below)

node0:
--------------------------------------------------------------------------
Hostname: node0
Model: srx345-dual-ac
Junos: 15.1X49-D130.6
JUNOS Software Release [15.1X49-D130.6]

node1:
--------------------------------------------------------------------------
Hostname: node1
Model: srx345-dual-ac
Junos: 15.1X49-D130.6
JUNOS Software Release [15.1X49-D130.6]

Steps to Reproduce the Issue

Error Traceback

(Paste the complete traceback of the exception between quotes below)

No errors, per se. Just an empty dataset being returned.

I inspected the elements at:
https://github.com/napalm-automation/napalm/blob/develop/napalm/junos/junos.py#L381

And also `environment_data` at:
https://github.com/napalm-automation/napalm/blob/develop/napalm/junos/junos.py#L381

All elements are empty. I confirmed connectivity with `get_facts()`, which does return results as expected.
mirceaulinic commented 4 years ago

Thanks for reporting @mcroff-wish. A good starting point to fix this - for you, or whoever is going to look into it - would be sharing:

mcroff-wish commented 4 years ago

Hi @mirceaulinic. So the variables in question contain no data. There is no XML returned for get_environment(), yet get_facts() behaves as expected, which is why I'm puzzled. It's not an error; it's just an empty dataset.

Doing a basic print() on environment_data results in:

{}
mirceaulinic commented 4 years ago

There is no XML returned for get_environment() yet get_facts()

get_environment and get_facts don't return XML. I was referring to the XML replies the device returns for the RPC calls I mentioned just above.

mcroff-wish commented 4 years ago

@mirceaulinic Correct, I was looking for the XML returned but I did not see any being returned. I looked through junos.py and table.py with pdb but I wasn't able to find any XML actually being returned. Do you have any suggestions to further investigate? It's a little hard to understand the flow of data for this function call.

mirceaulinic commented 4 years ago

When I say check the XML output, I mean running, on the CLI of your device, the command equivalent of the RPC call. For example, show chassis environment | display xml is the command-line equivalent of the get-environment-information RPC I mentioned earlier (you can check this by executing show chassis environment | display xml rpc).

I was looking for the XML returned but I did not see any being returned.

If the command above indeed returns empty XML as you're saying, then you'll probably have to raise a JTAC case, but I doubt it. Otherwise, please share the XML documents as requested.
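For anyone following along, here is a minimal sketch of how the RPC name can be read out of a `| display xml rpc` reply once it has been copied from the CLI. The sample string mirrors the reply shape for `show chassis environment | display xml rpc`; the function names are hypothetical helpers, not part of napalm. The hyphen-to-underscore step reflects the usual Junos PyEZ convention for calling an RPC as a Python method.

```python
import xml.etree.ElementTree as ET

# Reply shape as printed by `show chassis environment | display xml rpc`.
SAMPLE_REPLY = """\
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <rpc>
        <get-environment-information>
        </get-environment-information>
    </rpc>
    <cli>
        <banner>{primary:node0}</banner>
    </cli>
</rpc-reply>
"""


def rpc_name_from_reply(reply_xml):
    """Return the tag of the first element inside <rpc> (hypothetical helper)."""
    root = ET.fromstring(reply_xml)
    rpc = root.find("rpc")
    if rpc is None or len(rpc) == 0:
        raise ValueError("no RPC element found in reply")
    return rpc[0].tag


def pyez_method_name(rpc_name):
    """Map an RPC tag to the PyEZ-style method name (hyphens become underscores)."""
    return rpc_name.replace("-", "_")


if __name__ == "__main__":
    name = rpc_name_from_reply(SAMPLE_REPLY)
    print(name, "->", pyez_method_name(name))
```

The same helper should work for the other `| display xml rpc` replies, as long as the reply keeps the `<rpc>` wrapper.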

mcroff-wish commented 4 years ago

That's an interesting idea. I hadn't thought to examine the cli data. And, not surprisingly, it looks like JNPR is doing the wrong thing.

show chassis environment | display xml

<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <multi-routing-engine-results>

        <multi-routing-engine-item>

            <re-name>node0</re-name>

            <environment-information xmlns="http://xml.juniper.net/junos/15.1X49/junos-chassis">
                <environment-item>
                    <name>Routing Engine</name>
                    <class>Temp</class>
                    <status>OK</status>
                    <temperature junos:celsius="39">39 degrees C / 102 degrees F</temperature>
                </environment-item>
                <environment-item>
                    <name>Routing Engine CPU</name>
                    <status>OK</status>
                    <temperature junos:celsius="68">68 degrees C / 154 degrees F</temperature>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 0</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 1</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 2</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 3</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>Power Supply 0</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
                <environment-item>
                    <name>Power Supply 1</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>

        <multi-routing-engine-item>

            <re-name>node1</re-name>

            <environment-information xmlns="http://xml.juniper.net/junos/15.1X49/junos-chassis">
                <environment-item>
                    <name>Routing Engine</name>
                    <class>Temp</class>
                    <status>OK</status>
                    <temperature junos:celsius="43">43 degrees C / 109 degrees F</temperature>
                </environment-item>
                <environment-item>
                    <name>Routing Engine CPU</name>
                    <status>OK</status>
                    <temperature junos:celsius="71">71 degrees C / 159 degrees F</temperature>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 0</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 1</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 2</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 3</name>
                    <class>Fans</class>
                    <status>OK</status>
                    <comment>Spinning at normal speed</comment>
                </environment-item>
                <environment-item>
                    <name>Power Supply 0</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
                <environment-item>
                    <name>Power Supply 1</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>

    </multi-routing-engine-results>
    <cli>
        <banner>{primary:node0}</banner>
    </cli>
</rpc-reply>

show chassis environment | display xml rpc

<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <rpc>
        <get-environment-information>
        </get-environment-information>
    </rpc>
    <cli>
        <banner>{primary:node0}</banner>
    </cli>
</rpc-reply>
mirceaulinic commented 4 years ago

Great @mcroff-wish. What about the other 3 RPCs needed for this function to work well: get-route-engine-information, get-temperature-threshold-information, and get-power-usage-information-detail - would you be able to find their equivalent CLI commands and share here the output? :-)

mcroff-wish commented 4 years ago

I was able to pull the data requested. It's attached below:

Commands without RPC:

show chassis temperature-thresholds | display xml
show chassis routing-engine | display xml
show chassis environment pem | display xml

Commands with RPC:

show chassis temperature-thresholds | display xml rpc
show chassis routing-engine | display xml rpc
show chassis environment pem | display xml rpc

Temp Thresholds:

root@srx> show chassis temperature-thresholds | display xml
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <temperature-threshold-information xmlns="http://xml.juniper.net/junos/15.1X49/junos-chassis">
        <temperature-threshold>
            <name>Chassis default</name>
            <fan-normal-speed>35</fan-normal-speed>
            <fan-high-speed>56</fan-high-speed>
            <bad-fan-yellow-alarm>40</bad-fan-yellow-alarm>
            <bad-fan-red-alarm>65</bad-fan-red-alarm>
            <yellow-alarm>60</yellow-alarm>
            <red-alarm>75</red-alarm>
            <fire-shutdown>100</fire-shutdown>
        </temperature-threshold>
        <temperature-threshold>
            <name>Routing Engine</name>
            <fan-normal-speed>35</fan-normal-speed>
            <fan-high-speed>56</fan-high-speed>
            <bad-fan-yellow-alarm>40</bad-fan-yellow-alarm>
            <bad-fan-red-alarm>65</bad-fan-red-alarm>
            <yellow-alarm>60</yellow-alarm>
            <red-alarm>75</red-alarm>
            <fire-shutdown>100</fire-shutdown>
        </temperature-threshold>
    </temperature-threshold-information>
    <cli>
        <banner></banner>
    </cli>
</rpc-reply>

root@srx> show chassis temperature-thresholds | display xml rpc
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <rpc>
        <get-temperature-threshold-information>
        </get-temperature-threshold-information>
    </rpc>
    <cli>
        <banner></banner>
    </cli>
</rpc-reply>

Routing-Engine:

root@srx> show chassis routing-engine | display xml
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <route-engine-information xmlns="http://xml.juniper.net/junos/15.1X49/junos-chassis">
        <route-engine>
            <status>OK</status>
            <temperature junos:celsius="35">35 degrees C / 95 degrees F</temperature>
            <cpu-temperature junos:celsius="63">63 degrees C / 145 degrees F</cpu-temperature>
            <memory-system-total>4096</memory-system-total>
            <memory-system-total-used>1311</memory-system-total-used>
            <memory-system-total-util>32</memory-system-total-util>
            <memory-control-plane>2624</memory-control-plane>
            <memory-control-plane-used>682</memory-control-plane-used>
            <memory-control-plane-util>26</memory-control-plane-util>
            <memory-data-plane>1472</memory-data-plane>
            <memory-data-plane-used>618</memory-data-plane-used>
            <memory-data-plane-util>42</memory-data-plane-util>
            <cpu-user>6</cpu-user>
            <cpu-background>0</cpu-background>
            <cpu-system>2</cpu-system>
            <cpu-interrupt>0</cpu-interrupt>
            <cpu-idle>92</cpu-idle>
            <model>RE-SRX345-DUAL-AC</model>
            <serial-number>DS1319AF0037</serial-number>
            <start-time junos:seconds="1584485864">2020-03-17 22:57:44 UTC</start-time>
            <up-time junos:seconds="3626317">41 days, 23 hours, 18 minutes, 37 seconds</up-time>
            <last-reboot-reason>0x1:power cycle/failure </last-reboot-reason>
            <load-average-one>0.19</load-average-one>
            <load-average-five>0.09</load-average-five>
            <load-average-fifteen>0.02</load-average-fifteen>
        </route-engine>
    </route-engine-information>
    <cli>
        <banner></banner>
    </cli>
</rpc-reply>

root@srx> show chassis routing-engine | display xml rpc
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <rpc>
        <get-route-engine-information>
        </get-route-engine-information>
    </rpc>
    <cli>
        <banner></banner>
    </cli>
</rpc-reply>
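A rough sketch of how the two replies above could be combined: read the yellow and red alarm thresholds for a component, then flag a temperature reading against them. The XML is trimmed to the fields used here, and the rule (alert above yellow, critical above red) is an assumption for illustration, not necessarily what napalm's `get_environment()` does. `parse_thresholds` and `classify` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the temperature-thresholds reply (only the fields used below).
THRESHOLDS_XML = """\
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <temperature-threshold-information xmlns="http://xml.juniper.net/junos/15.1X49/junos-chassis">
        <temperature-threshold>
            <name>Routing Engine</name>
            <yellow-alarm>60</yellow-alarm>
            <red-alarm>75</red-alarm>
        </temperature-threshold>
    </temperature-threshold-information>
</rpc-reply>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def parse_thresholds(reply_xml):
    """Return {component name: {'yellow': float, 'red': float}}."""
    thresholds = {}
    for el in ET.fromstring(reply_xml).iter():
        if local_name(el.tag) != "temperature-threshold":
            continue
        fields = {local_name(c.tag): (c.text or "").strip() for c in el}
        thresholds[fields["name"]] = {
            "yellow": float(fields["yellow-alarm"]),
            "red": float(fields["red-alarm"]),
        }
    return thresholds


def classify(temp_c, limits):
    """Assumed rule: alert above the yellow threshold, critical above red."""
    return {
        "temperature": temp_c,
        "is_alert": temp_c > limits["yellow"],
        "is_critical": temp_c > limits["red"],
    }


if __name__ == "__main__":
    limits = parse_thresholds(THRESHOLDS_XML)["Routing Engine"]
    # 35 is the routing-engine temperature reported in the reply above.
    print(classify(35.0, limits))
```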

PEM:

root@srx> show chassis environment pem | display xml
error: command is not valid on the srx345-dual-ac
error: command is not valid on the srx345-dual-ac
mirceaulinic commented 4 years ago

Oh, I think I see what the issue is: the XML reply is usually like https://github.com/napalm-automation/napalm/blob/develop/test/junos/mocked_data/test_get_environment/normal/get-environment-information.xml, while on SRX it's nested under <multi-routing-engine-results><multi-routing-engine-item>. This is a bit problematic, as the question is: which environment details should we return, from node0 or node1? The current structure for get_environment doesn't permit returns from multiple routing engines, so I'm inclined to believe we should return only from the master routing engine. Thoughts?
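To make the nesting concrete, here is a minimal sketch, using only the standard library, of unwrapping a reply that may or may not be nested under `<multi-routing-engine-item>`. It picks a single node by its `<re-name>` (defaulting to `node0` purely for illustration) and falls back to the flat layout otherwise. `select_environment` is a hypothetical helper and is not how napalm itself parses the reply.

```python
import xml.etree.ElementTree as ET

NS = "http://xml.juniper.net/junos/15.1X49/junos-chassis"

# Trimmed clustered reply: one item per node.
CLUSTER_REPLY = f"""\
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <multi-routing-engine-results>
        <multi-routing-engine-item>
            <re-name>node0</re-name>
            <environment-information xmlns="{NS}">
                <environment-item>
                    <name>Power Supply 0</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>
        <multi-routing-engine-item>
            <re-name>node1</re-name>
            <environment-information xmlns="{NS}">
                <environment-item>
                    <name>Power Supply 1</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>
    </multi-routing-engine-results>
</rpc-reply>
"""

# Trimmed flat reply: environment-information directly under rpc-reply.
FLAT_REPLY = f"""\
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <environment-information xmlns="{NS}">
        <environment-item>
            <name>Power Supply 0</name>
            <class>Power</class>
            <status>OK</status>
        </environment-item>
    </environment-information>
</rpc-reply>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def select_environment(reply_xml, node="node0"):
    """Return environment items for one node, or for a flat (non-nested) reply."""
    root = ET.fromstring(reply_xml)
    items = [el for el in root.iter() if local_name(el.tag) == "multi-routing-engine-item"]
    scope = root
    if items:
        matches = [
            el for el in items
            if any(local_name(c.tag) == "re-name" and (c.text or "").strip() == node for c in el)
        ]
        if not matches:
            return []
        scope = matches[0]
    return [
        {local_name(c.tag): (c.text or "").strip() for c in el}
        for el in scope.iter()
        if local_name(el.tag) == "environment-item"
    ]


if __name__ == "__main__":
    print(select_environment(CLUSTER_REPLY))
    print(select_environment(FLAT_REPLY))
```

Choosing by a fixed node name is the simplest option; selecting the master routing engine instead would need extra information that this reply alone does not obviously carry.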

mcroff-wish commented 4 years ago

So this will be a problem for a lot of the JNPR product line. MX's and EX switches commonly have multiple RE's. I would propose the following:

  1. Return the results from node0 for now.
  2. Modify the data structure to take into account multiple RE's for the platform as a future feature request.
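As a sketch of what option 2 might look like, the snippet below groups environment items per node, keyed by `<re-name>`. The per-node layout (`fans`, `power`, `temperature`) loosely follows the keys `get_environment()` already uses, but the outer per-node dictionary, the class-to-key mapping, and the function name are assumptions for discussion, not an existing napalm structure.

```python
import xml.etree.ElementTree as ET

NS = "http://xml.juniper.net/junos/15.1X49/junos-chassis"
CATEGORY_BY_CLASS = {"Fans": "fans", "Power": "power", "Temp": "temperature"}

# Trimmed clustered reply with a few items per node.
CLUSTER_REPLY = f"""\
<rpc-reply xmlns:junos="http://xml.juniper.net/junos/15.1X49/junos">
    <multi-routing-engine-results>
        <multi-routing-engine-item>
            <re-name>node0</re-name>
            <environment-information xmlns="{NS}">
                <environment-item>
                    <name>Routing Engine</name>
                    <class>Temp</class>
                    <status>OK</status>
                    <temperature junos:celsius="39">39 degrees C / 102 degrees F</temperature>
                </environment-item>
                <environment-item>
                    <name>SRX340 Chassis fan 0</name>
                    <class>Fans</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>
        <multi-routing-engine-item>
            <re-name>node1</re-name>
            <environment-information xmlns="{NS}">
                <environment-item>
                    <name>Power Supply 0</name>
                    <class>Power</class>
                    <status>OK</status>
                </environment-item>
            </environment-information>
        </multi-routing-engine-item>
    </multi-routing-engine-results>
</rpc-reply>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to namespaced names."""
    return tag.rsplit("}", 1)[-1]


def child(el, name):
    """Return the first direct child with the given local name, or None."""
    for c in el:
        if local_name(c.tag) == name:
            return c
    return None


def text_of(el, name):
    c = child(el, name)
    return (c.text or "").strip() if c is not None else None


def environment_by_node(reply_xml):
    """Return {node: {'fans': {...}, 'power': {...}, 'temperature': {...}}}."""
    result = {}
    for item in ET.fromstring(reply_xml).iter():
        if local_name(item.tag) != "multi-routing-engine-item":
            continue
        entry = result.setdefault(
            text_of(item, "re-name"), {"fans": {}, "power": {}, "temperature": {}}
        )
        for env in item.iter():
            if local_name(env.tag) != "environment-item":
                continue
            name = text_of(env, "name")
            temp = child(env, "temperature")
            category = CATEGORY_BY_CLASS.get(text_of(env, "class"))
            if category is None and temp is not None:
                category = "temperature"  # items without <class> but with a reading
            if category == "temperature" and temp is not None:
                celsius = next(
                    (v for k, v in temp.attrib.items() if local_name(k) == "celsius"), None
                )
                if celsius is not None:
                    entry["temperature"][name] = {"temperature": float(celsius)}
            elif category in ("fans", "power"):
                entry[category][name] = {"status": text_of(env, "status") == "OK"}
    return result


if __name__ == "__main__":
    print(environment_by_node(CLUSTER_REPLY))
```

Returning node0 only (option 1) would then amount to picking one key out of this dictionary, which keeps the current return shape unchanged for existing callers.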
mcroff-wish commented 4 years ago

I should have noted earlier, I tested on two SRX-345s that only have one node. Both are returning the same empty set of data from get_environment(). So I do believe we have a couple of issues going on for both clustered and stand-alone SRX's running >15 code.

mirceaulinic commented 4 years ago

I don't have a lot of experience with SRX, which is why I haven't seen that sort of output before. MX, QFX and EX with multiple routing engines are just fine in my experience, as the output is normal, similar to the one I linked.

On Wed, 29 Apr 2020, 20:02 mcroff-wish, notifications@github.com wrote:

So this will be a problem for a lot of the JNPR product line. MX's and EX switches commonly have multiple RE's. I would propose the following:

  1. Return the results from node0 for now.
  2. Modify the data structure to take into account multiple RE's for the platform as a future feature request.


mcroff-wish commented 4 years ago

Oh that's interesting. I would have expected, probably incorrectly, that they would have been consistent. So this seems to be a different use case than the MX, QFX and EX. So maybe for now we just consider node0?

mirceaulinic commented 4 years ago

I would have expected, probably incorrectly, that they would have been consistent.

Wouldn't that be nice? :laughing:

So this seems to be a different use case than the MX, QFX and EX. So maybe for now we just consider node0?

Sounds good to me.

mirceaulinic commented 4 years ago

One more request. You provided:

root@srx> show chassis environment pem | display xml
error: command is not valid on the srx345-dual-ac
error: command is not valid on the srx345-dual-ac

Can you provide the output from the alternative command valid on this platform? (and its XML RPC)

mcroff-wish commented 4 years ago

It doesn't seem to be supported on the SRX platform with fixed power supplies. I would suspect it's relying on the routing-engine data. I can't find any command to gather more info for the branch devices.

mirceaulinic commented 4 years ago

@mcroff-wish are you able to test this branch: #1196 and confirm this fixes the issue as discussed?

mcroff-wish commented 4 years ago

@mirceaulinic I was able to test with a standalone SRX345 and a clustered SRX345. Both return env variables! Looks like it's fixed.

mirceaulinic commented 4 years ago

Great, thanks for confirming @mcroff-wish.

On Tue, 5 May 2020, 20:39 mcroff-wish, notifications@github.com wrote:

@mirceaulinic https://github.com/mirceaulinic I was able to test with a standalone SRX345 and a clustered SRX345. Both return env variables! Looks like it's fixed.

mcroff-wish commented 4 years ago

I pushed this fix by hand to stage netbox and confirmed the env variables are now populated.