Implement datasource metrics

Description

Implement datasource metrics.

Use case or motivation behind the feature request

Users currently cannot see the progress of a running query, which is frustrating and needs to be fixed. Now that the datasource uses the Spark 3 APIs, it can provide metrics about its progress.

Please create at least the following metrics, aggregated into JSON format:

Driver

Current archive offset
Kafka offset

Task

Number of records processed
Number of bytes processed
Bytes per second
Records per second
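
As a starting point for discussion, the task-side metrics listed above could be collected roughly as follows. This is a minimal sketch in plain Java with no Spark dependency. The class name `TaskMetricsSketch` and the JSON field names are hypothetical and not part of pth_06, and the JSON layout is only a placeholder until the schema is defined. In Spark 3.2 and later, values like these could presumably be reported through the `CustomTaskMetric` connector interface instead of being serialized by hand.

```java
import java.util.Locale;

/** Hypothetical per-task metrics holder; names and JSON layout are assumptions. */
final class TaskMetricsSketch {
    private long records; // number of records processed so far
    private long bytes;   // number of bytes processed so far

    /** Registers one processed record of the given size in bytes. */
    public void add(long recordBytes) {
        records++;
        bytes += recordBytes;
    }

    /** Returns count per second, or 0.0 when no time has elapsed. */
    static double perSecond(long count, long elapsedNanos) {
        return elapsedNanos <= 0 ? 0.0 : count / (elapsedNanos / 1_000_000_000.0);
    }

    /** Serializes the four task metrics into a flat JSON object. */
    public String toJson(long elapsedNanos) {
        return String.format(
            Locale.ROOT,
            "{\"recordsProcessed\":%d,\"bytesProcessed\":%d,"
                + "\"recordsPerSecond\":%.2f,\"bytesPerSecond\":%.2f}",
            records,
            bytes,
            perSecond(records, elapsedNanos),
            perSecond(bytes, elapsedNanos));
    }
}
```

A task would call `add(...)` once per record it reads and `toJson(...)` with its elapsed time in nanoseconds whenever progress is reported.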

Please consider pre-creating buckets (hourly or automatically sized) in the driver for the earliest-latest span, and binning the data processed in the tasks into these buckets.
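
The bucketing idea could look roughly like the sketch below, assuming hourly buckets keyed by their start time in epoch seconds. The class name `HourlyBucketsSketch` is hypothetical, and for brevity the sketch keeps bucket creation (the driver's job) and binning (the tasks' job) in one class. A real implementation would need to ship the bucket boundaries to the tasks and merge their counts.

```java
import java.util.TreeMap;

/** Hypothetical hourly bucket counter; not part of pth_06. */
final class HourlyBucketsSketch {
    private static final long HOUR = 3600L; // bucket width in seconds

    // Bucket start (epoch seconds) -> number of records binned into it.
    private final TreeMap<Long, Long> counts = new TreeMap<>();

    /** Pre-creates one empty bucket per hour touching [earliest, latest]. */
    HourlyBucketsSketch(long earliest, long latest) {
        long first = Math.floorDiv(earliest, HOUR) * HOUR;
        for (long start = first; start <= latest; start += HOUR) {
            counts.put(start, 0L);
        }
    }

    /**
     * Bins one record by its timestamp.
     * Returns false when no pre-created bucket covers the timestamp.
     */
    boolean add(long epochSeconds) {
        long start = Math.floorDiv(epochSeconds, HOUR) * HOUR;
        if (!counts.containsKey(start)) {
            return false;
        }
        counts.merge(start, 1L, Long::sum);
        return true;
    }

    /** Exposes the bucket counts, ordered by bucket start. */
    TreeMap<Long, Long> counts() {
        return counts;
    }
}
```

Pre-creating the buckets means empty hours still show up with a count of zero, which keeps the reported span stable while tasks are still running.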

Please define JSON schema once initial development is done.

Related issues

https://github.com/teragrep/ajs_01/issues/70 depends on this issue.

Additional context

See the example at #74 and close when implemented.

teragrep / pth_06

Implement datasource metrics #75