MeltanoLabs / meltano-map-transform

A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/
Apache License 2.0

`MELTANO_MAP_TRANSFORMER_STREAM_MAPS` not working and not specific #195

Open jlloyd-widen opened 7 months ago

jlloyd-widen commented 7 months ago

Meltano Hub's entry for the Meltano Map Transformer indicates that you should be able to configure the plugin with the env var `MELTANO_MAP_TRANSFORMER_STREAM_MAPS`. However, this env var doesn't seem to work, in spite of my having provided what should be the correct value (example below).

In addition, this env var is confusing because the plugin can contain several mappings, each with a different name. The env var is therefore misnamed: nothing in it specifies which mapping it is going to modify.

Here's my use case: depending on the stream, a few columns from my tap either come in as binary and need to be cast, or need to be made null.

meltano.yml

  mappers:
  - name: meltano-map-transformer
    variant: meltano
    pip_url: git+https://github.com/MeltanoLabs/meltano-map-transform.git
    settings:
    - name: stream_maps
      kind: object
    mappings:
    - name: mapping1
      config:
        stream_maps:
          'foo-bar':
            spam:
    - name: mapping2
      config:
        stream_maps:
          'foo-eggs':
            spam: str(spam)

When I use the config above, it works as expected. However, when I remove that config and set the env var as follows to modify mapping2, the expected behavior does not happen.

MELTANO_MAP_TRANSFORMER_STREAM_MAPS={"foo-eggs":{"spam":"str(spam)"}}

I need the ability to modify the config via env vars because I build my config dynamically prior to each run.
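For anyone building the value dynamically, here is a minimal Python sketch of how the env var string could be generated and sanity-checked before a run. It only serializes the `stream_maps` object to JSON and confirms that it parses back. The helper name `build_stream_maps_env` is hypothetical, and the sketch does not show that Meltano actually applies the variable to `mapping2`, which is the open question in this issue.

```python
import json
import os

# Env var name taken from this issue; whether Meltano honors it for a
# specific mapping is exactly what is being reported as not working.
ENV_VAR = "MELTANO_MAP_TRANSFORMER_STREAM_MAPS"


def build_stream_maps_env(stream_maps: dict) -> str:
    """Serialize a stream_maps dict to compact JSON (hypothetical helper)."""
    return json.dumps(stream_maps, separators=(",", ":"))


# Same mapping as `mapping2` in the meltano.yml above.
stream_maps = {"foo-eggs": {"spam": "str(spam)"}}

value = build_stream_maps_env(stream_maps)
os.environ[ENV_VAR] = value

# Round-trip check: the stored string must parse back to the same object.
assert json.loads(os.environ[ENV_VAR]) == stream_maps

print(f"{ENV_VAR}={value}")
```

Generating the string with `json.dumps` rather than typing it by hand keeps the double quotes intact. If the value is instead assigned directly in a shell, it generally needs to be wrapped in single quotes so the shell does not strip the JSON quotes.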

jnv commented 3 months ago

This is also biting me. Apparently there are more env variables involved, but the whole thing is pretty much undocumented: https://github.com/meltano/sdk/issues/1073

Not sure if it's relevant, but I also found this in integration tests: https://github.com/meltano/meltano/blob/8a65004cbe04ece893d7c17e084387cb27297f8f/integration/meltano-manifest/expected-manifests/meltano-manifest.json#L2082-L2095

jnv commented 3 months ago

I dug deeper into this, and it all points to the fact that setting the env variable should work exactly this way:

MELTANO_MAP_TRANSFORMER_STREAM_MAPS={"foo-eggs":{"spam":"str(spam)"}}

However, I suspect it doesn't work due to a bug in configuration serialization during invocation from the Meltano CLI: https://github.com/meltano/meltano/issues/8507

edgarrmondragon commented 3 months ago

Got a draft PR: https://github.com/meltano/meltano/pull/8509