metafacture / metafacture-fix

Work in progress towards an implementation of the Fix language for Metafacture
Apache License 2.0

Measurement converter #256

Open TobiasNx opened 1 year ago

TobiasNx commented 1 year ago

In the context of OERSI, I need to transform milliseconds (ms) into the ISO 8601:2004 duration format: https://www.iso.org/standard/40874.html

83941 -> 01M24S (if I am not mistaken)

Is there a way to introduce a measurement converter, perhaps not only for date and time but also for other measurement units and standards?

unit_converter("duration", is: "ms", to: "ISO8601:2004")

TobiasNx commented 1 year ago

The basis for the converter could be https://github.com/metafacture/metafacture-core/blob/master/metamorph/src/main/java/org/metafacture/metamorph/functions/DateFormat.java

It could then be enhanced to support other kinds of measures, ISO standards and units.

TobiasNx commented 1 year ago

For my transformation, the Java library Joda-Time (https://www.joda.org/joda-time/) or its successor in the JDK, java.time, could be used. Perhaps instead of creating a general measurement converter, we could start with a simpler dayTime converter?
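As a rough illustration of the java.time route, the sketch below turns a millisecond count into an ISO 8601 duration string with `java.time.Duration`. The class and method names are hypothetical and not part of metafacture-fix. Note that `Duration.toString()` emits the `P`/`T` designators and keeps the fractional seconds, so rounding to whole seconds is shown as a separate, assumed step.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not an existing metafacture-fix function.
final class MillisToIsoDuration {

    private MillisToIsoDuration() {
    }

    // Exact conversion: the millisecond fraction is kept in the seconds field.
    static String exact(final long millis) {
        return Duration.ofMillis(millis).toString();
    }

    // Assumption: the target field only wants whole seconds, so round first.
    static String roundedToSeconds(final long millis) {
        return Duration.ofSeconds(Math.round(millis / 1000.0)).toString();
    }

    public static void main(final String[] args) {
        System.out.println(exact(83941L));            // PT1M23.941S
        System.out.println(roundedToSeconds(83941L)); // PT1M24S
    }
}
```

Whether the rounded or the exact form is wanted depends on what the consuming schema accepts; both are valid ISO 8601 durations.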

TobiasNx commented 1 year ago

The other scenario I need is converting 14.24 MB into bytes.

fsteeg commented 1 year ago

The approach sketched here is pretty involved and will probably take some time to implement. To find a good starting point that addresses the actual use case, could you expand a bit on the background in OERSI that you need this for? Is it described in some OERSI issue?

TobiasNx commented 1 year ago

For my transformation, the Java library Joda-Time (https://www.joda.org/joda-time/) or its successor in the JDK, java.time, could be used. Perhaps instead of creating a general measurement converter, we could start with a simpler dayTime converter?

As I remarked, a general measurement converter is probably too broad. There are two scenarios I need at the moment.

The first is to convert incoming ms into the ISO 8601:2004 duration format: https://www.iso.org/standard/40874.html

83941 -> 01M24S (if I am not mistaken)

dateTime_converter("duration", is: "ms", to: "ISO8601:2004")

The other scenario I need is to convert 14.24 MB into bytes.

fileSize_converter("encoding.*.size", is: "MB", to: "B")

I could illustrate them on a more concrete level, since both are related to orca_educast.

TobiasNx commented 1 year ago

Example for duration: https://gitlab.com/oersi/oersi-etl/-/merge_requests/273
The duration values fail because they are given in ms.
(See commit: https://gitlab.com/oersi/oersi-etl/-/merge_requests/273/diffs?commit_id=7aab1b1a0a41cc341f65ade93ce9806fe30953f1)

Example for size: https://gitlab.com/oersi/oersi-etl/-/merge_requests/274
("contentSize": "492.08 MB" should be "515983278.08")
(See commit: https://gitlab.com/oersi/oersi-etl/-/merge_requests/274/diffs?commit_id=589d53c51b449eadecc9ed0f9a26fdf7c2820bbd)
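The size example can be checked with a few lines of plain Java. This is only a sketch under assumptions: the input is taken to be a number and a unit separated by whitespace, and the units are treated as binary multiples (1 MB = 1024 * 1024 bytes), since that is the factor the expected value 515983278.08 implies. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.Locale;

// Hypothetical helper for illustration only; not an existing metafacture-fix function.
final class FileSizeToBytes {

    private static final BigDecimal FACTOR = BigDecimal.valueOf(1024);

    private FileSizeToBytes() {
    }

    // Parses strings such as "492.08 MB" and returns the size in bytes.
    // Assumption: binary multiples (1 KB = 1024 B), matching the thread's expected value.
    static BigDecimal toBytes(final String size) {
        final String[] parts = size.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<number> <unit>': " + size);
        }
        final BigDecimal value = new BigDecimal(parts[0]);
        switch (parts[1].toUpperCase(Locale.ROOT)) {
            case "B":
                return value;
            case "KB":
                return value.multiply(FACTOR);
            case "MB":
                return value.multiply(FACTOR.pow(2));
            case "GB":
                return value.multiply(FACTOR.pow(3));
            default:
                throw new IllegalArgumentException("Unsupported unit: " + parts[1]);
        }
    }

    public static void main(final String[] args) {
        System.out.println(toBytes("492.08 MB").toPlainString()); // 515983278.08
        System.out.println(toBytes("14.24 MB").toPlainString());  // 14931722.24
    }
}
```

Using `BigDecimal` avoids floating-point noise in the result; if the target field has to be an integer byte count, a rounding step would still be needed.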

fsteeg commented 1 year ago

For a general approach in metafacture-fix, we might want to use the existing format method, perhaps with some additional options, so roughly something like format("duration", "ISO8601", input: "ms"). However, to make the use cases clearer and get a working solution in OERSI sooner, I suggest we implement custom Java functions in the OERSI project first and, based on that, consider integrating them into metafacture-fix.

blackwinter commented 1 year ago

format("duration", "ISO8601", input: "ms")

Unfortunately, this would not be compatible with the current format string parameter.