google / textfsm

Python module for parsing semi-structured text into python tables.
Apache License 2.0

pls help with multiple text lines parsing into a LIST #96

Closed Kennyisnothere closed 8 months ago

Kennyisnothere commented 3 years ago

text as below:

   VID  Ports
   1    UT:Eth-Trunk39(D)   Eth-Trunk100(D)   10GE1/0/3(D)   10GE1/0/4(D)
        10GE2/0/3(D)   10GE2/0/4(D)   GE1/0/1(U)   GE1/0/3(U)
        GE1/0/5(U)   GE1/0/7(U)   GE1/0/9(U)   GE1/0/11(U)
   20   TG:Eth-Trunk1(U)   Eth-Trunk2(U)   Eth-Trunk3(U)   Eth-Trunk4(U)
        Eth-Trunk5(U)   Eth-Trunk6(U)   Eth-Trunk7(U)   Eth-Trunk8(U)
        Eth-Trunk9(U)   Eth-Trunk10(U)   Eth-Trunk11(U)   Eth-Trunk12(U)

textfsm template as below:

Value VID (\d+)
Value List Port (Eth-Trunk\d+|10GE\d+/\d+/\d+|GE\d+/\d+/\d+)

Start
  ^ +.* -> Continue.Record
  ^ +${VID} +(UT|ST|UT|MP|TG):${Port}( -> Continue
  ^ +${VID} +\S+)+ +${Port}( -> Continue
  ^ +${VID} +\S+) +\S+ +${Port}( -> Continue
  ^ +${VID} +\S+)+ +\S+ +\S+ +${Port}( -> Continue
  ^ +${Port}( -> Continue
  ^ +\S+)+ +${Port}( -> Continue
  ^ +\S+) +\S+)+ +${Port}( -> Continue
  ^ +\S+) +\S+)+ +\S+ +${Port}( -> Continue
  ^ +\S+) +\S+)+ +\S+ +\S+ +${Port}( -> Continue

result as below:

=============== RESTART: C:\Users\kennytan\Desktop\Python\Huawei - py\HW-TexFSM1.py ==============
VID  Port
1    ['Eth-Trunk39', 'Eth-Trunk100', '10GE1/0/3', '10GE1/0/4']
     ['10GE2/0/3', '10GE2/0/4', 'GE1/0/1', 'GE1/0/3']
     ['GE1/0/5', 'GE1/0/7', 'GE1/0/9', 'GE1/0/11']
20   ['Eth-Trunk1', 'Eth-Trunk2', 'Eth-Trunk3', 'Eth-Trunk4']
     ['Eth-Trunk5', 'Eth-Trunk6', 'Eth-Trunk7', 'Eth-Trunk8']
     ['Eth-Trunk9', 'Eth-Trunk10', 'Eth-Trunk11', 'Eth-Trunk12']

How can I merge the second and third lists into the first list, and likewise merge the last three lists into one list? Many thanks!

jmcgill298 commented 3 years ago

The first line should only use Continue.Record for lines that start a new VID's Ports, something like this: ^\d+ +(UT|ST|UT|MP|TG): -> Continue.Record

The final match for the different sections should not have Continue, as you do not want to continue trying to find additional matches for those lines. Dropping the Continue will finish finding matches for the current output line, pick up the next line, and start looking for matches back at the top of the current State (Start in this example):

  ^ +${VID} +\S+)+ +\S+ +\S+ +${Port}( -> Continue
becomes
  ^ +${VID} +\S+)+ +\S+ +\S+ +${Port}(

  ^ +\S+) +\S+)+ +\S+ +\S+ +${Port}( -> Continue
becomes
  ^ +\S+) +\S+)+ +\S+ +\S+ +${Port}(

You also don't want to capture VID each time; it should just be ^\d+ +(UT|ST|UT|MP|TG): after the first capture.

Kennyisnothere commented 3 years ago

Thanks for the reply. I will try and get back later

Kennyisnothere commented 3 years ago

Hi @jmcgill298,

HW-TexFSM(disp vlan).template.txt dis vlan.txt HW-TexFSM(disp vlan).py.txt

I tried your advice, but it still did not work as desired. Maybe it's my fault for not stating the problem clearly in the first place. I have attached the template and related files for your reference; I hope they make things clearer. Thanks again!

jmcgill298 commented 3 years ago

@Kennyisnothere my advice works, but from what you have shown, you have not implemented it.

I advised changing ^ +.* -> Continue.Record to ^ +\d+ +(UT|ST|UT|MP|TG): -> Continue.Record; you have changed it to capture the VID and Port groups, which will not yield the results you want. The purpose of this opening Continue.Record is to detect every new VID after the first one, which signals to the FSM that the previous VID's Port assignments are complete and need to be recorded. Using capture groups in this line has two effects: 1) it produces a spurious opening capture, and 2) it overwrites or adds to the groups of the previous entry with data from what should be the next entry.

I also advised not capturing the VID group after the first match, which means changing ${VID} +(UT|ST|UT|MP|TG): to \d+ +(UT|ST|UT|MP|TG):. Instead, you removed the pattern that matches the VID characters altogether.

In the template file you recently attached, I also see an EOF state, which again will not yield the results you want. TextFSM has an implicit EOF -> Record; adding an explicit EOF overrides this, so nothing is recorded at the end of the output (this is handy when using Filldown, as that feature creates spurious final entries). Since entries are only recorded when a new VID is identified, the final entry will not be recorded unless TextFSM also records on EOF.

I would also like to point out an additional problem with the template as it currently stands: it requires 4 Port assignments and is limited to 12, which will not reflect reality across your environment. I highly recommend you test against data like this:

   1         UT:Eth-Trunk39(D)
   2         UT:Eth-Trunk39(D)  Eth-Trunk100(D)
   3         UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)
   4         UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)
   5         UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)    
                10GE2/0/3(D)    
   10       UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)    
                10GE2/0/3(D)    10GE2/0/4(D)
   100     UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)    
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)
   1000   UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)    
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
   1001   UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)    
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
                10GE2/0/3(D)    10GE2/0/4(D)    GE1/0/1(U)      GE1/0/3(U)  
   1002
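
One way to check a template against data shaped like this is to compare its output with an independent reference parse. The sketch below is plain Python using only the re module, not TextFSM; the regexes and the parse_vlan_ports helper are assumptions written for this layout (a VID line followed by indented continuation lines, each port followed by a one-character flag in parentheses).

```python
import re

# A VID line starts with a number followed by "XX:" or by nothing at all.
VID_RE = re.compile(r"^\s*(\d+)(?:\s+\w+:|\s*$)")
# A port is a name immediately followed by a flag such as (D) or (U).
PORT_RE = re.compile(r"([\w/-]+)\(\w\)")

SAMPLE = """\
   1         UT:Eth-Trunk39(D)
   5         UT:Eth-Trunk39(D)  Eth-Trunk100(D) 10GE1/0/3(D)    10GE1/0/4(D)
                 10GE2/0/3(D)
   1002
"""


def parse_vlan_ports(text):
    """Map each VID to the list of ports on its line and continuation lines."""
    result = {}
    current = None
    for line in text.splitlines():
        match = VID_RE.match(line)
        if match:
            current = result.setdefault(match.group(1), [])
        if current is not None:
            current.extend(PORT_RE.findall(line))
    return result


expected = parse_vlan_ports(SAMPLE)

if __name__ == "__main__":
    for vid, ports in expected.items():
        print(vid, len(ports), ports)
```

Comparing the per-VID port counts from this helper with the TextFSM result quickly shows whether a template silently drops or duplicates ports.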

Kennyisnothere commented 3 years ago

Hi @jmcgill298,

Thanks for your guidance along the way. I finally got the expected result, although there are still a few things I could not really grasp. I have attached the files again and would really appreciate it if you have time to look at them.

  1. For the first line, I have corrected it to ^ +\d+ +(UT|ST|UT|MP|TG): -> Continue.Record. Sorry for the blunder I made last time. Your detailed explanation really helped me get on the right track.

  2. I could not understand what you meant by "not capturing the VID group after the first match, which is to change ${VID} -> \d+ +(UT|ST|UT|MP|TG):", so I kept what was already there, and somehow the result is as expected. I would really appreciate any further comments on this.

  3. As for the implicit EOF -> Record, I had only heard of it and never knew how to use it; I would just add it and see what the outcome was. Thanks for your explanation. I think I understand its usage better now, as another way to deal with the situation you mentioned ("this is handy when using Filldown, as that feature creates spurious final entries"). P.S. Until now I only knew how to use the Required keyword to handle that situation.

  4. Yes, the previous template (4 Port assignments, limited to 12) was made that way on purpose to see whether the output could be parsed. I also used the test data you proposed to verify it. Thanks again!

dis vlan1.txt HW-TexFSM(disp vlan)1.template.txt

result: image

jmcgill298 commented 3 years ago

What I meant by not capturing the VID is not using ${VID} for each VID rule, but only the first time it appears. You will still need to match the contents of the VID section in order to capture the next interface associated with the VID. TextFSM's ${GROUP_NAME} syntax is translated into a named capture group like (?P<GROUP_NAME>regex_pattern_defined_in_template). There is no need to rewrite the contents of the VID group for each additional port on the first line.
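
The translation described above can be imitated with Python's re module alone. This is only a rough sketch: the expand helper is hypothetical and is not TextFSM's own code, and the two rules are simplified examples. It shows that a rule can match the VID with a plain \d+ and still capture the next port.

```python
import re

# Value definitions as they appear in the template from this thread.
VALUES = {
    "VID": r"(\d+)",
    "Port": r"(Eth-Trunk\d+|10GE\d+/\d+/\d+|GE\d+/\d+/\d+)",
}


def expand(rule):
    """Replace each ${NAME} with a named group, roughly as TextFSM does."""
    for name, pattern in VALUES.items():
        # Turn the leading "(" of the value regex into "(?P<NAME>".
        named = "(?P<" + name + ">" + pattern[1:]
        rule = rule.replace("${" + name + "}", named)
    return rule


LINE = "1    UT:Eth-Trunk39(D)   Eth-Trunk100(D)"

# The first rule captures the VID and the first port.
FIRST_PORT = r"^\s*${VID}\s+(UT|ST|MP|TG):${Port}\("
# The second rule only matches the VID (no ${VID}) and captures the second port.
SECOND_PORT = r"^\s*\d+\s+(UT|ST|MP|TG):\S+\s+${Port}\("

first = re.match(expand(FIRST_PORT), LINE).groupdict()
second = re.match(expand(SECOND_PORT), LINE).groupdict()

if __name__ == "__main__":
    print(expand(FIRST_PORT))
    print(first)
    print(second)
```

Only named groups end up in the result, so the second rule contributes a Port and leaves the already captured VID untouched.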

Looking at the template, I would also highly recommend using a catchall rule and making it raise an Error. Without catching unknown lines and raising an error, you will be using the template on output that it does not fully capture, and you will not even know that your data is incomplete (for example, what happens if the interface name starts with 100GE?)

Start
  ^\s*VID\s+Port\s*$$ -> Data

Data
  ...
  ^\s*-+\s*$$
  ^\s*$$
  ^. -> Error

harro commented 8 months ago

Closing as a stale issue. Please reopen if that is not the case.