MeltanoLabs / meltano-map-transform

A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/
Apache License 2.0

Mapping crashing with empty stream #164

Open ovidals opened 10 months ago

ovidals commented 10 months ago

Hi, I'm facing an error when the mapper does not receive any content from the stream.

This is my config:

  mappers:
    - name: meltano-map-transformer
      variant: meltano
      pip_url: git+https://github.com/MeltanoLabs/meltano-map-transform.git
      mappings:
        - name: mapping_1
          config:
            stream_maps:
              employees:
                salary: "float(salary) if salary != '' else 0"
            stream_map_config:
              hash_seed: 95EWZh7A6DzGm6iJZZ2T

The stream gets its content from a CSV file extracted with the "tap-sftp" plugin.

When there is no CSV file in the folder, the stream is empty and I get this error:

Traceback (most recent call last): cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.901756Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/bin/meltano-map-transform", line 8, in <module> cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.902120Z [info     ]     sys.exit(StreamTransform.cli()) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.902383Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.902664Z [info     ]     return self.main(*args, **kwargs) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.903036Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.903295Z [info     ]     rv = self.invoke(ctx)      cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.903651Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.904032Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.904302Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.904614Z [info     ]     return __callback(*args, **kwargs) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.904956Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper_base.py", line 134, in invoke cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.905274Z [info     ]     mapper.listen(file_input)  cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.910230Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/io_base.py", line 34, in listen cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.911487Z [info     ]     self._process_lines(file_input) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.912739Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/io_base.py", line 90, in _process_lines cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.913353Z [info     ]     self._process_schema_message(line_dict) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.913885Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper_base.py", line 37, in _process_schema_message cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.914339Z [info     ]     self._write_messages(self.map_schema_message(message_dict)) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.914732Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper_base.py", line 33, in _write_messages cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.915297Z [info     ]     for message in messages:   cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.915875Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/meltano_map_transform/mapper.py", line 90, in map_schema_message cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.916238Z [info     ]     self.mapper.register_raw_stream_schema( cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.917076Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper.py", line 731, in register_raw_stream_schema cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.917690Z [info     ]     mapper = CustomStreamMap(  cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.918278Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper.py", line 269, in __init__ cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.918920Z [info     ]     ) = self._init_functions_and_schema(stream_map=map_transform) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.919701Z [info     ]   File "/project/.meltano/mappers/meltano-map-transformer/venv/lib/python3.8/site-packages/singer_sdk/mapper.py", line 468, in _init_functions_and_schema cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.920471Z [info     ]     or self.raw_schema["properties"].get(prop_def, {}) cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer
2023-08-23T07:57:16.921146Z [info     ] KeyError: 'properties'         cmd_type=elb consumer=True name=meltano-map-transformer producer=True stdio=stderr string_id=meltano-map-transformer

Is there any way to configure the mapper so that it does not crash on empty content?

Thank you.

edgarrmondragon commented 10 months ago

Thanks for reporting @ovidals!

What is the result you'd expect from mapping a stream with an empty schema?

ovidals commented 10 months ago

> Thanks for reporting @ovidals!
>
> What is the result you'd expect from mapping a stream with an empty schema?

Hi @edgarrmondragon, I guess that if a pipeline step does not receive anything as input, it should also output an empty stream, right?

For example, before adding the mapper I had a pipeline with an extractor and a transformer. If the extractor does not extract anything, an empty stream is sent to the transformer, so the transformer does nothing and the pipeline ends successfully without any processing. After adding an intermediate step such as a mapper, I would expect the pipeline to behave the same way, without crashing.

Does that make sense?

Thank you!

edgarrmondragon commented 2 months ago

Fixing this should be as simple as using .get(...) in https://github.com/meltano/sdk/blob/46c14f6003baf658be0dcc25f1307c7007f5be27/singer_sdk/mapper.py#L489-L490
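
For illustration, here is a minimal sketch of the kind of change that suggestion points at. It is not the actual SDK patch: `property_schema` is a hypothetical helper, and the empty dict is only an assumed shape for the schema of a stream with no content. It contrasts the direct `["properties"]` subscript seen in the traceback with a chained `.get(...)` lookup.

```python
from typing import Any, Dict


def property_schema(raw_schema: Dict[str, Any], prop_name: str) -> Dict[str, Any]:
    """Return a property's schema, tolerating a schema without a "properties" key.

    Hypothetical helper for illustration only; it is not part of singer_sdk.
    """
    return raw_schema.get("properties", {}).get(prop_name, {})


# Assumption: an empty stream yields a schema with no "properties" key.
empty_schema: Dict[str, Any] = {}
populated_schema = {"properties": {"salary": {"type": ["string", "null"]}}}

# The direct subscript from the traceback raises KeyError on the empty schema.
try:
    empty_schema["properties"].get("salary", {})
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

# The chained .get(...) lookup falls back to an empty dict instead.
safe_empty = property_schema(empty_schema, "salary")
safe_populated = property_schema(populated_schema, "salary")
```

Whether the rest of the mapper then handles an empty property schema correctly is not shown here and would need to be checked against the SDK itself.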