jf-tech / omniparser

omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
MIT License

Adding `flatfile.csv.reader` implementation and tests; fix a bug in `flatfile.fixedlength` where column match goes wrong. #178

Closed: jf-tech closed this 2 years ago

jf-tech commented 2 years ago

The bug is in `flatfile.fixedlength.reader.linesToNode`:

func (r *reader) linesToNode(decl *EnvelopeDecl, n int) *idr.Node {
    if len(r.linesBuf) < n {
        panic(
            fmt.Sprintf("linesBuf has %d lines but requested %d lines to convert",
                len(r.linesBuf), n))
    }
    node := idr.CreateNode(idr.ElementNode, decl.Name)
    for col := range decl.Columns {
        colDecl := decl.Columns[col]
        for i := 0; i < n; i++ {
            if !colDecl.lineMatch(i, r.linesBuf[i].b) {
                continue
            }
            colNode := idr.CreateNode(idr.ElementNode, colDecl.Name)
            idr.AddChild(node, colNode)
            colVal := idr.CreateNode(idr.TextNode, colDecl.lineToColumnValue(r.linesBuf[i].b))
            idr.AddChild(colNode, colVal)
            break // <==== PREVIOUSLY MISSED.
        }
    }
    return node
}

The v1.0.3 release missed the crucial `break` statement. As a result, for any multi-line envelope (whether rows-based or header/footer-based), if multiple lines match a column, the last matching line always wins. This is not what our specification has been: the first line that matches should win. It's fixed now, and tests have been amended to catch the issue.
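To make the first-match-wins rule concrete, here is a minimal, self-contained sketch. It does not use the omniparser `idr` package: the `column` type, its prefix-based `lineMatch`, and the `extract` helper are hypothetical stand-ins for illustration only, and the map overwrite is a simplification of how a later match ends up replacing an earlier one when the `break` is absent.

```go
package main

import (
	"fmt"
	"strings"
)

// column is a hypothetical stand-in for a column declaration: it matches
// any line that starts with prefix and takes the rest of the line as value.
type column struct {
	name   string
	prefix string
}

func (c column) lineMatch(line string) bool { return strings.HasPrefix(line, c.prefix) }

func (c column) value(line string) string { return strings.TrimPrefix(line, c.prefix) }

// extract maps each column name to a value taken from lines.
// With firstWins set, scanning stops at the first matching line (the
// intended behavior). Without it, scanning continues and a later match
// overwrites an earlier one, mimicking the effect of the missing break.
func extract(cols []column, lines []string, firstWins bool) map[string]string {
	out := make(map[string]string)
	for _, c := range cols {
		for _, line := range lines {
			if !c.lineMatch(line) {
				continue
			}
			out[c.name] = c.value(line)
			if firstWins {
				break
			}
		}
	}
	return out
}

func main() {
	cols := []column{{name: "id", prefix: "ID:"}}
	lines := []string{"ID:first", "NOTE:x", "ID:second"}

	fmt.Println(extract(cols, lines, true)["id"])  // prints "first"
	fmt.Println(extract(cols, lines, false)["id"]) // prints "second"
}
```

A regression test for this kind of bug needs at least one envelope in which two lines match the same column; with only one matching line, both variants produce the same result.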

codecov[bot] commented 2 years ago

Codecov Report

Merging #178 (9f7ec63) into master (77370e7) will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            master      #178    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           51        52     +1     
  Lines         2836      2971   +135     
==========================================
+ Hits          2836      2971   +135     
| Impacted Files | Coverage Δ |
|---|---|
| extensions/omniv21/fileformat/flatfile/csv/decl.go | 100.00% <100.00%> (ø) |
| ...tensions/omniv21/fileformat/flatfile/csv/reader.go | 100.00% <100.00%> (ø) |
| .../omniv21/fileformat/flatfile/fixedlength/reader.go | 100.00% <100.00%> (ø) |
