google / textfsm

Python module for parsing semi-structured text into python tables.
Apache License 2.0

Named Records?? #29

Closed ray-linn closed 4 years ago

ray-linn commented 6 years ago

Hi Sir,

I have some text files that store data such as machine, software part numbers, and hardware part numbers. The format looks something like this:

Model: Y300 SN: 1111111111 Ship Date: 2018/4/25 State: Cancel
...
SW 1111 Windows 10
SW 1112 Office 2013
SW 1113 Eclipse 3.0
...
HW 2012 Intel CPU I6
---
HW 2013 4GB DIMM
.....

It is clear that there are two one-to-many associations here (machine --> hardware, machine --> software). However, when I use TextFSM to parse the text, all parts are stored in a single record, which is bad for further processing (I cannot tell whether a part is SW or HW from the part number alone).

My current solution is to split my template into SW.fsm and HW.fsm. It works, but it is not elegant. May I know whether TextFSM could provide a feature like named records, which would let me create tables that store values by logical group? The template could work like this:

Record("Order") PK ${SN}                 # Claim a record with a primary key
Record("Order") one-many Record("SW")    # Associate another record
Record("Order") one-many Record("HW")    # Associate another record

Model: ${model} -> Record("Order")
SW ${SW} ... -> Record("SW")
HW ${HW} ... -> Record("HW")

After parsing, there would be three tables in memory with two one-to-many relationships.

BR RAy

harro commented 5 years ago

If we want to produce a result that is easy to post-process in a single pass, what about the following?

Value Filldown Model (\S+)
Value Key,Filldown SerialNo (\S+)
Value PartType (SW|HW)
Value Key PartNo (\d+)
Value PartDesc (.*)

The table would then look like:

Y300, 1111111111, SW, 1111, Windows 10
Y300, 1111111111, SW, 1112, Office 2013
Y300, 1111111111, SW, 1113, Eclipse 3.0
...
Y300, 1111111111, HW, 2012, Intel CPU I6
Y300, 1111111111, HW, 2013, 4GB DIMM

With the unique Key for each row being SN & PartNo

sumkincpp commented 4 years ago

Post-processing is of course an option, but having multiple linked record types would be better, since it would be built into the engine itself.

For simple, consecutive, unrelated records, the syntax might be based on State, e.g.:

State -> TableName

harro commented 4 years ago

@sumkincpp I'm not sure what you mean by linking record types?

gachteme commented 4 years ago

@harro from what I can tell there are two ideas floating around here: one is creating a graph structure between distinct objects that the user defines (linking), the other is a purely hierarchical data type like machine.port.isopen. Personally, I already have some post-processing tools that parse the hierarchical data structures for the machines I use, and it isn’t too hard to do there.

The hierarchical structure would definitely be useful for me, but it doesn’t seem to fit into the current output format, is easy to do with some post processing, and wouldn’t help much unless you had extremely large files you were parsing.

harro commented 4 years ago

Thanks @gachteme for the explanation. The favored approach is to use post-processing where possible, unless there is an imperative otherwise. @ray-linn, if the proposed workaround above doesn't work for some reason, let me know by reopening the issue.
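
As a rough illustration of the post-processing route, the flat rows from the workaround can be split into one order table and two part tables keyed by serial number. This is a sketch under assumptions: the function name `split_tables` and the dictionary layout are hypothetical, and the rows are hard-coded in the column order Model, SerialNo, PartType, PartNo, PartDesc.

```python
from collections import defaultdict

# Flat rows in the shape of the table shown earlier in the thread.
rows = [
    ["Y300", "1111111111", "SW", "1111", "Windows 10"],
    ["Y300", "1111111111", "SW", "1112", "Office 2013"],
    ["Y300", "1111111111", "SW", "1113", "Eclipse 3.0"],
    ["Y300", "1111111111", "HW", "2012", "Intel CPU I6"],
    ["Y300", "1111111111", "HW", "2013", "4GB DIMM"],
]


def split_tables(flat_rows):
    """Split flat rows into an order table and per-type part tables.

    Hypothetical helper: returns (orders, sw, hw), where orders maps a
    serial number to its machine data, and sw/hw map a serial number to
    the list of parts of that type.
    """
    orders = {}
    parts = {"SW": defaultdict(list), "HW": defaultdict(list)}
    for model, serial, part_type, part_no, desc in flat_rows:
        # One order entry per serial number (the "one" side).
        orders.setdefault(serial, {"Model": model})
        # Parts are grouped under their order (the "many" side).
        parts[part_type][serial].append({"PartNo": part_no, "PartDesc": desc})
    return orders, parts["SW"], parts["HW"]


orders, sw, hw = split_tables(rows)
print(orders)
print(dict(sw))
print(dict(hw))
```

A single pass over the rows is enough here because every row already carries its serial number and part type.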