google / textfsm

Python module for parsing semi-structured text into python tables.
Apache License 2.0

textfsm Cannot recognize '-' #125

Open tuobi555 opened 1 month ago

tuobi555 commented 1 month ago

With textfsm 1.1.3, TextFSM cannot recognize '-'. [image]

If I change it to \W, the line is recognized, but so is a lot of erroneous information. [image]

mjbear commented 1 month ago

@tuobi555 Your hyphen/dash is escaped as it needs to be so that's good.

If you haven't yet, it would be advisable to temporarily trim off the part of your regex past ${Device} until that part captures correctly. From there, add portions of the regex back until it misbehaves again.
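
A minimal sketch of that trimming step, using Python's re module rather than TextFSM itself. The pattern is the start of a row regex cut off right after the device capture, with the value regex (\w+\W?\w+\W?\w+) inlined by hand. The two sample rows are copied from the display device output posted later in this thread, and the helper name capture_device is made up for illustration.

```python
import re

# Row regex trimmed to end right after the device capture
# (the ${Device} value regex is inlined by hand for this sketch).
TRIMMED = re.compile(r"^\w{1,4}\s+-\s+(?P<Device>\w+\W?\w+\W?\w+)")

ROWS = [
    "3     -      CE-L24LQ-EC1             Present  On    Registered   Normal    NA",
    "15    -      CMU(MPU 9)               Present  On    Registered   Normal    Master",
]


def capture_device(line):
    """Return what the trimmed pattern captures as Device, or None if no match."""
    match = TRIMMED.match(line)
    return match.group("Device") if match else None


for row in ROWS:
    print(repr(capture_device(row)))
```

Printing the capture for each row shows whether the hyphen step matched and whether the device capture covers the whole Type column before the rest of the regex is added back. In this sketch the second row's capture stops before the closing parenthesis.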

What platform and device is this output from? Without a copy of your template and a (sanitized) copy of your raw output there's not much to help with.

tuobi555 commented 1 month ago

cs01.txt textfsm.txt

mjbear commented 1 month ago

Hello @tuobi555 Ah, this is Huawei VRP. There are quite a few command outputs in that 5 MByte cs01.txt file.

:bulb: Rather than parse a mountain of output with a single template, it would be better to create a template per command. (And some of the work may already exist over at ntc-templates [hyperlink below].)

From your screenshots I figure you are intending to parse "environmental" related items (Fans, etc). This appears to be the display device command.

Your template

Value List Device (\w+\W?\w+\W?\w+)
Value List Device_state (\w+)

Start
 ^[\w]{1,4}\s+\-\s+${Device}\s+\S+\s+\S+\s+\S+\s+${Device_state}\s+\S+\ *$$

Regarding the template, instead of two lists, why not capture multiple dictionaries of the two pairs {device, device_state}? The benefit here is the ability to loop over a list of dictionaries and access both values more easily than having to "track" and access the index across the second list. (Example at the end of this post.)

Your raw CLI output

<ABCDEFG_HIJ_cs01>display device
CE12808S's Device status:
-------------------------------------------------------------------------------------------
Slot  Card   Type                     Online   Power Register     Alarm     Primary        
-------------------------------------------------------------------------------------------
3     -      CE-L24LQ-EC1             Present  On    Registered   Normal    NA             
4     -      CE-L24LQ-EC1             Present  On    Registered   Normal    NA             
6     -      CE-L24LQ-EC1             Present  On    Registered   Normal    NA             
7     -      CE-L24LQ-EC1             Present  On    Registered   Normal    NA             
8     -      CE-L48XS-EC              Present  On    Registered   Normal    NA             
9     -      CE-MPUA-S                Present  On    Registered   Normal    Master         
10    -      CE-MPUA-S                Present  On    Registered   Normal    Slave          
11    -      CE-SFUC-S                Present  On    Registered   Normal    NA             
12    -      CE-SFUC-S                Present  On    Registered   Normal    NA             
13    -      CE-SFUC-S                Present  On    Registered   Normal    NA             
14    -      CE-SFUC-S                Present  On    Registered   Normal    NA             
15    -      CMU(MPU 9)               Present  On    Registered   Normal    Master         
16    -      CMU(MPU 10)              Present  On    Registered   Normal    Slave          
PWR1  -      PHD-3000WA               Present  On    Registered   Normal    NA             
PWR2  -      PHD-3000WA               Present  On    Registered   Normal    NA             
PWR3  -      PHD-3000WA               Present  On    Registered   Normal    NA             
PWR4  -      PHD-3000WA               Present  On    Registered   Normal    NA             
FAN1  -      FAN-600A-B               Present  On    Registered   Normal    NA             
FAN2  -      FAN-600A-B               Present  On    Registered   Normal    NA             
FAN3  -      FAN-600A-B               Present  On    Registered   Normal    NA             
FAN4  -      FAN-600A-B               Present  On    Registered   Normal    NA             
FAN5  -      FAN-600A-B               Present  On    Registered   Normal    NA             
FAN6  -      FAN-600A-B               Present  On    Registered   Normal    NA             
-------------------------------------------------------------------------------------------

A template for huawei_vrp display device does not yet exist over at ntc-templates, but I'll assist and we can make that happen. If I do that I'd plan to capture most if not all the columns of data.

How's this sound? :grinning:

Example on the list vs dictionary:

{
    "Device": [
        "CE-L24LQ-EC1",
... snipped ...
    ],
    "Device_state": [
        "Normal",
... snipped ...
    ]
}

Versus

[
    {
        "Device": "CE-L24LQ-EC1",
        "Device_state": "Normal"
    },
... snipped ...
]
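
As a rough illustration of the difference, the two parallel lists can be turned into the list of dictionaries with zip. This is a plain-Python sketch, not something TextFSM does for you; the function name columns_to_records is hypothetical.

```python
def columns_to_records(columns):
    """Turn {"key": [v1, v2, ...], ...} into [{"key": v1, ...}, {"key": v2, ...}].

    zip stops at the shortest list, so a value missing from one list silently
    drops or misaligns rows, which is the risk of tracking indexes by hand.
    """
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*(columns[k] for k in keys))]


parallel = {
    "Device": ["CE-L24LQ-EC1", "CE-SFUC-S"],
    "Device_state": ["Normal", "Normal"],
}

for record in columns_to_records(parallel):
    print(record["Device"], record["Device_state"])
```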

mjbear commented 1 month ago

Teaser

[
... snipped ...
    {
        "ALARM_STATUS": "Normal",
        "CARD": "-",
        "DEVICE_TYPE": "CE-SFUC-S",
        "ONLINE_STATUS": "Present",
        "POWER_STATUS": "On",
        "PRIMARY_STATUS": "NA",
        "REGISTER_STATUS": "Registered",
        "SLOT": "14"
    },
    {
        "ALARM_STATUS": "Normal",
        "CARD": "-",
        "DEVICE_TYPE": "CMU(MPU 9)",
        "ONLINE_STATUS": "Present",
        "POWER_STATUS": "On",
        "PRIMARY_STATUS": "Master",
        "REGISTER_STATUS": "Registered",
        "SLOT": "15"
    },
    {
        "ALARM_STATUS": "Normal",
        "CARD": "-",
        "DEVICE_TYPE": "CMU(MPU 10)",
        "ONLINE_STATUS": "Present",
        "POWER_STATUS": "On",
        "PRIMARY_STATUS": "Slave",
        "REGISTER_STATUS": "Registered",
        "SLOT": "16"
    },
... snipped ...
]

tuobi555 commented 1 month ago

@mjbear Indeed, following your suggestions would be much easier. I am a novice in DevOps, with little experience and weak coding skills. My initial idea was to use a single TextFSM template to extract everything, so that it could directly generate JSON that I could then transfer to Excel for data analysis. If I split it up now, with one template per command, I will have to find a way to integrate the data in code. Fortunately, I now have AI assistance; otherwise, it would take me several months. Thank you all for your enthusiastic help. However, I am still curious why the '-' is not recognized when the template is written my way.

mjbear commented 1 month ago

@tuobi555 No worries, we all start somewhere and learn new things every day.

Ah, I was thinking you might have been using a library such as Paramiko/Netmiko to connect to the device and pull output which you were then parsing with TextFSM.

It may be necessary to "anchor" the regexes to make them more rigid -- loose regexes can match in situations where you don't want them to.
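
To illustrate how a loose regex over-matches, here is a small sketch with Python's re module (not TextFSM). It compares an escaped hyphen against \W in the Card position. \W also matches a space, so any row with enough whitespace after its first column gets through. The second sample row is made up to stand in for output from some other command.

```python
import re

# Strict: the Card column must be a literal hyphen.
STRICT = re.compile(r"^\w{1,4}\s+-\s+(\S+)")
# Loose: \W accepts any non-word character, including a space.
LOOSE = re.compile(r"^\w{1,4}\s+\W\s+(\S+)")

device_row = "3     -      CE-L24LQ-EC1             Present  On"
other_row = "10     common     enable"  # made-up row from some other command


def matches(pattern, line):
    """Return True if the pattern matches at the start of the line."""
    return pattern.match(line) is not None


for name, pattern in (("strict", STRICT), ("loose", LOOSE)):
    print(name, matches(pattern, device_row), matches(pattern, other_row))
```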

Combining the structured output (from TextFSM) into a single object or output file wouldn't be too difficult.
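
A minimal sketch of that combining step, assuming each command has already been parsed into a list of flat dictionaries. The function names, the hostname value, and the layout of the combined object are all assumptions for illustration; only the standard json and csv modules are used.

```python
import csv
import io
import json


def combine(hostname, parsed_by_command):
    """Wrap per-command records in one JSON-serializable object."""
    return {"hostname": hostname, "commands": parsed_by_command}


def records_to_csv(records):
    """Render a non-empty list of flat dicts as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=sorted(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


parsed = {
    "display device": [
        {"SLOT": "14", "DEVICE_TYPE": "CE-SFUC-S", "ALARM_STATUS": "Normal"},
        {"SLOT": "15", "DEVICE_TYPE": "CMU(MPU 9)", "ALARM_STATUS": "Normal"},
    ],
}

document = combine("cs01", parsed)  # "cs01" is just an example hostname
print(json.dumps(document, indent=4))
print(records_to_csv(parsed["display device"]))
```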

Does the display device output ever differ from what you provided, such that the slot is sometimes absent from the leftmost "column", as in the following example?

<HUAWEI> display device
Device status:
--------------------------------------------------------------------------------------
Slot  Card   Type                Online   Power Register     Alarm     Primary
--------------------------------------------------------------------------------------
1     -      CE6850-48S4Q-EI     Present  On    Registered   Normal    Master
      FAN2   FAN-40SA-B          Present  On    Registered   Normal    NA
      PWR2   W1PA04BF0           Present  On    Registered   Normal    NA
--------------------------------------------------------------------------------------

reference: https://support.huawei.com/enterprise/en/doc/EDOC1100074755/91652f17/device-status-checking-commands#EN-US_CLIREF_0141119950
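
If the slot can be blank on continuation rows, as in that documentation sample, the parsed records need the last seen slot carried forward. TextFSM has a Filldown value option for this inside the template; the sketch below shows the same idea in plain Python on already-parsed records. The function name and the SLOT/CARD keys are assumptions for illustration.

```python
def fill_down(records, key):
    """Copy the last non-empty value of `key` into records where it is blank."""
    filled = []
    last_seen = ""
    for record in records:
        if record.get(key):
            last_seen = record[key]
        # Build a new dict so the input records are left untouched.
        filled.append({**record, key: last_seen})
    return filled


rows = [
    {"SLOT": "1", "CARD": "-", "DEVICE_TYPE": "CE6850-48S4Q-EI"},
    {"SLOT": "", "CARD": "FAN2", "DEVICE_TYPE": "FAN-40SA-B"},
    {"SLOT": "", "CARD": "PWR2", "DEVICE_TYPE": "W1PA04BF0"},
]

for row in fill_down(rows, "SLOT"):
    print(row["SLOT"], row["CARD"], row["DEVICE_TYPE"])
```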

tuobi555 commented 1 month ago

AS02.log

@mjbear Although the CE6800, CE128, and CE168 series are all referred to as the CE series, their display output is laid out somewhat differently. Moreover, the hardware in my current environment is all the same, so I am unable to provide sample data that shows the differences. There is also a CE68 series device at the access layer, which is available for your reference. I am very grateful to everyone for the enthusiastic help.

mjbear commented 1 month ago

Hello @tuobi555

:bulb: Retrieving one large blob/string of commands from a device and parsing it with a single template is asking for difficulty. You should retrieve and parse command output individually (one command at a time) to simplify your problem.
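
If the output has already been captured as one blob, one possible way to get back to one command at a time is to split on the prompt lines. This is a rough sketch, assuming every command is echoed on a line that starts with a prompt of the form <hostname>, as in the display device sample earlier in the thread. The function name is hypothetical and the sample blob is made up.

```python
import re

# A prompt line: "<hostname>" followed by the echoed command (possibly empty).
PROMPT = re.compile(r"^<[^<>]+>[ \t]*(.*)$", re.MULTILINE)


def split_by_command(blob):
    """Map each echoed command to the raw text that follows it.

    A command that appears twice overwrites its earlier output.
    """
    outputs = {}
    prompts = list(PROMPT.finditer(blob))
    for current, following in zip(prompts, prompts[1:] + [None]):
        command = current.group(1).strip()
        if not command:
            continue  # bare prompt with no command after it
        end = following.start() if following else len(blob)
        outputs[command] = blob[current.end():end].strip("\n")
    return outputs


sample = (
    "<sw1>display clock\n"
    "2024-01-01 00:00:00\n"
    "<sw1>display device\n"
    "Slot  Card\n"
    "3     -\n"
    "<sw1>\n"
)

for command, text in split_by_command(sample).items():
    print(command, "->", len(text.splitlines()), "line(s)")
```

Each chunk can then be handed to the template written for that one command.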

:question: How are you retrieving your command output?

:question: Are you doing this via a programming language and/or automation software?

:exclamation: Some of the commands are syntactically incorrect, judging by the output: Error: Unrecognized command found at '^' position.

:question: It doesn't make sense to parse the output of both display vlan and display vlan brief. Wouldn't the normal display vlan output contain everything in the brief output and more? (Same situation with display stp and display stp brief, as well as display ospf lsdb and display ospf lsdb router, and so on.)

:exclamation: Wow, this is around 116 commands in one blob (cs01), which is when I stopped compiling my list below.

grep display ~/Downloads/cs01.txt | wc -l
116

grep Unrecognized ~/Downloads/cs01.txt | wc -l
17

(Here's my [partial] guess in alphabetical order based on the "cs01" raw output you shared and Huawei command reference.)

display acl all
display arp
display clock
display cpu
display current-configuration
display device
display device elabel
display device fan
display device power / **display device power system**
display dldp
display dldp statistics
display info-center
display interface vlanif
display lldp neighbor
display mac-address
display memory
display port vlan
display saved-configuration
**display stp** / display stp brief
display stp region-configuration
display stp tc-bpdu statistics
display stp topology-change
display switchover state
display version
**display vlan** / display vlan brief
display vrrp statistics
display users all

:bulb: In case you end up having any interest in ntc-templates, here's some information:

https://pynet.twb-tech.com/blog/netmiko-and-textfsm.html
https://github.com/networktocode/ntc-templates/
https://github.com/networktocode/ntc-templates/tree/master/tests/huawei_vrp
https://ntc-templates.readthedocs.io/en/latest/user/lib_overview/
https://ntc-templates.readthedocs.io/en/latest/dev/dev_parser/#index-file

mjbear commented 1 month ago

@tuobi555 Would you be able to get sanitized output for display device elabel and put it in a code block in your response? (put X's in for your barcode, serial number, and any other sensitive information)

(The example in Huawei's display device elabel command docs may have output removed.)

mjbear commented 1 month ago

@tuobi555 Are you willing to split up your process to one command at a time?

Do you have output for the command display device elabel?

tuobi555 commented 1 month ago

@mjbear Of course, that's fine. Initially, I just wanted to use this method to help me monitor the status of the equipment. As long as the requirements are met, the approach doesn't matter. Additionally, for CE series devices, the command to view the device's own serial number is 'display esn'. For other series, the command might be 'display device manufacture-info' or 'display elabel slot slot-id', where 'slot-id' is the slot number of the corresponding device.

mjbear commented 1 month ago

@tuobi555 You didn't say whether you're using Python, but regardless of the programming language it's simple to have a script loop through a list of commands and retrieve output from your VRP device.
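
A minimal sketch of such a loop in Python. The device connection is deliberately left out: collect_outputs takes any callable that sends one command and returns its output, and fake_send_command is a made-up stand-in so the example runs without a device. Both names are hypothetical.

```python
def collect_outputs(send_command, commands):
    """Run each command separately and keep its raw output under its own key."""
    return {command: send_command(command) for command in commands}


def fake_send_command(command):
    """Stand-in for a real device connection; returns canned text."""
    return "output of " + command


COMMANDS = ["display device", "display version", "display clock"]

outputs = collect_outputs(fake_send_command, COMMANDS)
for command, text in outputs.items():
    print(command, "->", text)
```

With Netmiko, the send_command method of a connection object could be passed in place of the fake.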

I don't have access to any Huawei VRP devices. I'm trying to help you so that's why I need you to provide output.

Please provide output for display device elabel (put X characters in for the serial number and anything else sensitive).
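
One way to do that masking is sketched below. It assumes the sensitive values sit on key=value lines; the key names BarCode and SerialNumber and the sample text are assumptions for illustration, not copied from real device output.

```python
import re

SENSITIVE_KEYS = ("BarCode", "SerialNumber")  # hypothetical key names


def mask_values(text, keys=SENSITIVE_KEYS):
    """Replace the value of each sensitive key=value line with X characters."""
    names = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r"^(%s)=(.*)$" % names, re.MULTILINE)
    # Keep the key and the value length, hide the value itself.
    return pattern.sub(lambda m: m.group(1) + "=" + "X" * len(m.group(2)), text)


sample = "BoardType=CE-MPUA-S\nBarCode=ABC123\n"  # made-up sample text
print(mask_values(sample))
```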

If you like, I can look at your long template file but you'll need to share it.

tuobi555 commented 1 month ago

@mjbear The information I provided earlier is all the data I have collected. If it contains no output for 'esn', then there's nothing more I can do. I also cannot casually take log files from my workplace. Thank you for your enthusiastic help. Of course, you are also welcome to share the command parsing templates that have already been made, so that more people can enjoy the convenience.

mjbear commented 1 month ago

@tuobi555 Ok, I'm guessing there could be some misunderstanding or mistranslation occurring. I'm not asking for sensitive or confidential data. The output could come from a lab device that is not in production, or the potentially "sensitive data" could be masked with other characters or words. But I won't push the subject, as I have a feeling this isn't going much further.

The template I created covers three display device commands since their output was similar. The new template details can be found at ntc-templates PR#1804.