AbsaOSS / cobrix

A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Apache License 2.0

FILLER fields not showing up when a parent field has only a FILLER child #510

Closed anu17011993 closed 2 years ago

anu17011993 commented 2 years ago

Describe the bug

We have a copybook in which a parent group contains only a FILLER OCCURS group, and that FILLER group has child fields. The schema of the DataFrame we get after reading the data does not contain these child fields (CHILD1, CHILD2).

Copybook (if possible)

01 RECORD.
    03 PARENT2.
        05 FIELD1   PIC X(65).
        05 FIELD2.
            07 FILLER OCCURS 12 TIMES.
                10 CHILD1 PIC S9(7) COMP-3.
                10 CHILD2 PIC S99V99999 COMP-3.
        05 FIELD3 PIC X.

The final schema of the DataFrame looks like this:

 |-- RECORD
 |    |-- PARENT2
 |    |    |-- FIELD1
 |    |    |-- FIELD3

If the FILLER OCCURS 12 TIMES group has a sibling under the same parent FIELD2, as in the copybook below, the FILLER field shows up in the DataFrame. When it has no siblings under that parent, the fields do not show up, even if we set the drop_group_fillers and drop_value_fillers options to false.

01 RECORD.
    03 PARENT2.
        05 FIELD1   PIC X(65).
        05 FIELD2.
            07 FILLER OCCURS 12 TIMES.
                10 CHILD1 PIC S9(7) COMP-3.
                10 CHILD2 PIC S99V99999 COMP-3.
            07 FIELD PIC X.
        05 FIELD3 PIC X.
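
For reference, a read that sets both options might look like the sketch below. It is written in PySpark and makes several assumptions: the Cobrix jar is on the Spark classpath, the data source is registered under the `cobol` format name, and the helper names and paths are placeholders rather than anything from our actual job.

```python
def cobrix_read_options(copybook_path: str) -> dict:
    """Build reader options that ask Cobrix to keep FILLER groups and values.

    The option names come from this issue; the helper itself is hypothetical.
    """
    return {
        "copybook": copybook_path,
        "drop_group_fillers": "false",
        "drop_value_fillers": "false",
    }


def read_mainframe_file(spark, copybook_path: str, data_path: str):
    """Read an EBCDIC file with Cobrix (assumes an existing SparkSession)."""
    return (
        spark.read.format("cobol")
        .options(**cobrix_read_options(copybook_path))
        .load(data_path)
    )
```

Calling `read_mainframe_file(spark, "/path/to/copybook.cpy", "/path/to/data").printSchema()` is how the schemas above would be printed.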

yruslan commented 2 years ago

Thanks for the report. Will check. As a workaround for now you can rename the FILLER to, say, FILLER1.
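
If the copybook has many FILLER entries, the rename can be scripted before the copybook is passed to the reader. The following is only a rough sketch, not part of Cobrix: `rename_fillers` is a hypothetical helper, and it assumes one data description entry per line, no sequence numbers in columns 1-6, and that names such as FILLER1 are not already used in the copybook.

```python
import re

# A two-digit level number followed by the FILLER keyword at the start of a line.
_FILLER_ENTRY = re.compile(r"^(\s*\d{2}\s+)FILLER\b", re.IGNORECASE)


def rename_fillers(copybook: str) -> str:
    """Rename each FILLER entry to FILLER1, FILLER2, ... in order of appearance."""
    counter = 0

    def numbered(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        return f"{match.group(1)}FILLER{counter}"

    # keepends=True preserves the original line endings.
    lines = copybook.splitlines(keepends=True)
    return "".join(_FILLER_ENTRY.sub(numbered, line) for line in lines)
```

The result can then be written to a new copybook file and used in place of the original one.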

anu17011993 commented 2 years ago

Thanks @yruslan

yruslan commented 2 years ago

This should be fixed in master. You can try it now by building from source, or wait for the next release.