Open · Giackgamba opened this issue 11 months ago
Hi, sorry for the delayed reply. Yes, Spark is not required: you can use the cobol-parser dependency, which does not depend on Spark (it still requires the Scala library).
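For example, in an sbt build the dependency would look roughly like this (the version below is a placeholder, so substitute the latest cobol-parser release; from Maven you would use the same coordinates with an explicit Scala suffix, e.g. cobol-parser_2.12):

```scala
// Sketch of the sbt coordinates; replace <latest-version> with a real release
libraryDependencies += "za.co.absa.cobrix" %% "cobol-parser" % "<latest-version>"
```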
Here is an example, expressed as a unit test, of Cobrix used without Spark to convert mainframe data to JSON: https://github.com/AbsaOSS/cobrix/blob/master/cobol-converters/src/test/scala/za/co/absa/cobrix/cobol/converters/extra/SerializersSpec.scala
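At a high level, the Spark-free flow is: parse the copybook text into a Copybook object, then decode each binary record against it. Below is a minimal sketch of that flow. CopybookParser.parseTree and generateRecordLayoutPositions are documented in the README; getFieldByName and extractPrimitiveField are my recollection of the Copybook API, so verify them against the cobol-parser sources for the version you use:

```scala
import za.co.absa.cobrix.cobol.parser.CopybookParser
import za.co.absa.cobrix.cobol.parser.ast.Primitive

object NoSparkExample {
  def main(args: Array[String]): Unit = {
    val copybookContents =
      """       01  RECORD.
        |           05  ID       PIC 9(4).
        |           05  COMPANY  PIC X(3).
        |""".stripMargin

    // Parse the copybook text into an AST
    val copybook = CopybookParser.parseTree(copybookContents)

    // Sanity check: prints the offset and size of every field
    println(copybook.generateRecordLayoutPositions())

    // One hand-crafted EBCDIC record: ID = "0001", COMPANY = "ABC"
    val record = Array(0xF0, 0xF0, 0xF0, 0xF1, 0xC1, 0xC2, 0xC3).map(_.toByte)

    // Decode a single field from the raw record (method names assumed;
    // check them against your cobol-parser version)
    val field = copybook.getFieldByName("COMPANY").asInstanceOf[Primitive]
    println(copybook.extractPrimitiveField(field, record, 0))
  }
}
```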
One important detail: when Cobrix is used with Spark, it converts binary files to Spark DataFrames and uses the Spark type model. When Spark is not used, you supply a custom RecordHandler instead. An example of such a handler is in the test suite above; it uses Array[Any] (in Java it would probably be Object[]).
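To make that concrete, a minimal handler could look like the sketch below; the trait's import path and method signatures reflect my reading of the linked test suite, so double-check them against SerializersSpec and the cobol-parser module layout:

```scala
import za.co.absa.cobrix.cobol.parser.ast.Group
import za.co.absa.cobrix.cobol.reader.extractors.record.RecordHandler

// Represents each decoded record as a flat Array[Any]; a Java
// implementation would provide the same methods over Object[]
class SimpleRecordHandler extends RecordHandler[Array[Any]] {
  override def create(values: Array[Any], group: Group): Array[Any] = values
  override def toSeq(record: Array[Any]): Seq[Any] = record.toSeq
  override def foreach(record: Array[Any])(f: Any => Unit): Unit = record.foreach(f)
}
```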
Let me know if you have any more questions on this.
Background
Hi! I'm not an expert on COBOL/EBCDIC data structures, but I'm implementing a CDC scenario using Flink (in Java), and I have some binary fields to decode, given a copybook.
In the README you say that "The COBOL copybooks parser doesn't have a Spark dependency and can be reused for integrating into other data processing engines".
Question
Is that really the case? Roughly, what is the process to decode a single message? Are there any examples that don't involve the Spark "wrapper"?
Thank you in advance