AbsaOSS / cobrix

A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Apache License 2.0
138 stars 77 forks

#678 Add the ability to generate Spark schema based on strict integral precision #691

Closed (yruslan closed 4 months ago)

yruslan commented 4 months ago

New option:

.option("strict_integral_precision", "true")
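For context, a minimal sketch of where the option might sit in a full read call. It assumes Spark and the spark-cobol artifact are on the classpath; the object name, the method name, and both path parameters are hypothetical placeholders, not part of this PR.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper, for illustration only.
object StrictPrecisionReadSketch {
  // copybookPath and dataPath are placeholders supplied by the caller.
  def readWithStrictPrecision(spark: SparkSession,
                              copybookPath: String,
                              dataPath: String): DataFrame =
    spark.read
      .format("cobol")                              // Cobrix data source
      .option("copybook", copybookPath)             // copybook describing the record layout
      .option("strict_integral_precision", "true")  // the new option from this PR
      .load(dataPath)
}
```

Calling `printSchema()` on the returned DataFrame is one way to check which numeric types were generated.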

An example copybook:

      01  R.
                03 SEG-ID        PIC X(1).
                03 SEG1.
                  05 NUM1        PIC 9(2).
                03 SEG2 REDEFINES SEG1.
                  05 NUM2        PIC S9(9).
                03 SEG3 REDEFINES SEG1.
                  05 NUM3        PIC S9(15).

A schema without the option:

root
 |-- SEG_ID: string (nullable = true)
 |-- SEG1: struct (nullable = true)
 |    |-- NUM1: integer (nullable = true)
 |-- SEG2: struct (nullable = true)
 |    |-- NUM2: integer (nullable = true)
 |-- SEG3: struct (nullable = true)
 |    |-- NUM3: long (nullable = true)

A schema with the option:

root
 |-- SEG_ID: string (nullable = true)
 |-- SEG1: struct (nullable = true)
 |    |-- NUM1: decimal(2,0) (nullable = true)
 |-- SEG2: struct (nullable = true)
 |    |-- NUM2: decimal(9,0) (nullable = true)
 |-- SEG3: struct (nullable = true)
 |    |-- NUM3: decimal(15,0) (nullable = true)

(Note that integer and long are replaced with decimals whose precision matches the number of digits in the PIC clause and whose scale is 0.)
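The mapping seen in the two schemas can be sketched as a small pure-Scala function. This is an illustration and not the Cobrix implementation: the object and method names are hypothetical, and the cutoffs of 9 and 18 digits are assumptions that agree with the three fields above (2 and 9 digits give integer, 15 digits gives long) but are not stated in this PR.

```scala
// Hypothetical helper, not Cobrix's actual code.
object IntegralTypeSketch {
  // Assumed cutoffs: any 9-digit value fits in an Int, any 18-digit value fits in a Long.
  val MaxIntDigits = 9
  val MaxLongDigits = 18

  /** Spark type name for an integral field declared as PIC 9(n) or PIC S9(n). */
  def sparkTypeName(precision: Int, strictIntegralPrecision: Boolean): String = {
    require(precision > 0, "precision must be positive")
    if (strictIntegralPrecision) s"decimal($precision,0)" // keep the exact digit count
    else if (precision <= MaxIntDigits) "integer"
    else if (precision <= MaxLongDigits) "long"
    else s"decimal($precision,0)"                         // too wide for a Long
  }

  def main(args: Array[String]): Unit = {
    // The three numeric fields from the example copybook.
    for ((name, digits) <- Seq("NUM1" -> 2, "NUM2" -> 9, "NUM3" -> 15)) {
      val default = sparkTypeName(digits, strictIntegralPrecision = false)
      val strict  = sparkTypeName(digits, strictIntegralPrecision = true)
      println(s"$name: default=$default, strict=$strict")
    }
  }
}
```

Under these assumptions, `sparkTypeName(15, strictIntegralPrecision = false)` returns `"long"`, while the strict variant returns `"decimal(15,0)"`, matching NUM3 in the two schemas.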

github-actions[bot] commented 4 months ago

JaCoCo code coverage report - 'cobol-parser'

| File | Coverage [85.72%] | :green_apple: |
|:-|:-:|:-:|
| ParserVisitor.scala | 89.47% | :green_apple: |
| ANTLRParser.scala | 85.79% | :green_apple: |
| CopybookParser.scala | 83.75% | :green_apple: |
| CobolSchema.scala | 81.56% | :green_apple: |
| RecordExtractors.scala | 78.7% | :green_apple: |
| DecoderSelector.scala | 68.7% | :green_apple: |
| Total Project Coverage | 86.99% | :green_apple: |

github-actions[bot] commented 4 months ago

JaCoCo code coverage report - 'spark-cobol'

| File | Coverage [86.16%] | :green_apple: |
|:-|:-:|:-:|
| CobolSchema.scala | 89.79% | :green_apple: |
| CobolParametersParser.scala | 85.2% | :green_apple: |
| Total Project Coverage | 80.77% | :green_apple: |