meltano / sdk

Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
Apache License 2.0

bug: Cannot parse SQL Server timestamp type #2711

Open HaydenNess opened 1 day ago

HaydenNess commented 1 day ago

When trying to sync (CDC in my case) a table containing a 'timestamp' column, I get the error below. I have experienced this with both target-parquet and target-s3.

The error can be avoided by excluding the column from the select list, but a column of this type is often vital for incremental replication.

Traceback (most recent call last):
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 536, in _parse_timestamps_in_record
    date_val = datetime_fromisoformat(date_val)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Invalid isoformat string: '0000000000004651'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/bin/target-parquet", line 8, in <module>
    sys.exit(TargetParquet.cli())
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)     
         ^^^^^^^^^^^^^^^^     
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/plugin_base.py", line 80, in invoke
    return super().invoke(ctx)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 565, in invoke
    target.listen(file_input) 
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/io_base.py", line 35, in listen
    self._process_lines(file_input)
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 306, in _process_lines
    counter = super()._process_lines(file_input)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/io_base.py", line 94, in _process_lines
    self._process_record_message(line_dict)
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 356, in _process_record_message
    sink._validate_and_parse(transformed_record)  # noqa: SLF001
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 479, in _validate_and_parse
    self._parse_timestamps_in_record(
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 538, in _parse_timestamps_in_record
    date_val = handle_invalid_timestamp_in_record(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/helpers/_typing.py", line 225, in handle_invalid_timestamp_in_record
    raise ValueError(msg)     
ValueError: Could not parse value '0000000000004651' for field 'updated'.
Loader failed

edgarrmondragon commented 18 hours ago

Hi @HaydenNess!

I think the SQL Server tap may be at fault here, since it's declaring 0000000000004651 as a date-time string. The error is coming from https://github.com/meltano/sdk/blob/bf8384eb830ddc852bcbee43f3534f734fb79751/singer_sdk/sinks/core.py#L559

Letting the value through is probably not a good option either, but I'm not sure whether pyarrow would complain that it isn't a valid datetime.

Can you try using Meltano's schema override to change the type of the 'updated' field to something that better reflects its contents? Maybe integer.
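
For context, SQL Server's 'timestamp' type is a synonym for rowversion, an 8-byte binary counter rather than a date or time. The sketch below reproduces the parse failure with the standard library and shows one way the value could be read as an integer instead. It assumes the tap emits the value as a 16-character hex string, which matches the value in the traceback but is not confirmed by it; the helper names are hypothetical and are not part of the SDK.

```python
from datetime import datetime

# Value taken from the traceback above.
raw = "0000000000004651"


def parses_as_iso_datetime(value: str) -> bool:
    """Return True if the standard library accepts value as an ISO 8601 datetime."""
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


def rowversion_hex_to_int(value: str) -> int:
    """Interpret a hex-encoded rowversion as an integer (assumed encoding)."""
    return int(value, 16)


print(parses_as_iso_datetime(raw))  # False
print(rowversion_hex_to_int(raw))   # 18001
```

Read this way, the value still increases with every row change, so it could in principle serve as an integer replication key.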

HaydenNess commented 7 hours ago

Hi Edgar,

I've tried the schema override in various ways, without any success. It's unclear to me whether it is doing anything at all or simply doesn't affect the result.

I'll post this over to tap-mssql and see if it can be addressed there.
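
One way to check whether a schema override is being applied is to inspect the catalog the tap ends up with, for example one dumped with `meltano invoke --dump=catalog` for the tap. The sketch below is a minimal illustration that looks up a single property in a Singer catalog. The stream id `dbo-my_table` and the inline catalog are made up for the example; a real catalog would be loaded from the dumped JSON instead.

```python
import json
from typing import Optional


def property_schema(catalog: dict, stream_id: str, prop: str) -> Optional[dict]:
    """Return the JSON schema of one property in one stream, or None if absent."""
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") == stream_id:
            return stream.get("schema", {}).get("properties", {}).get(prop)
    return None


# Hypothetical stand-in for a dumped catalog; the stream id is invented.
example_catalog = json.loads("""
{
  "streams": [
    {
      "tap_stream_id": "dbo-my_table",
      "schema": {
        "properties": {
          "updated": {"type": ["string", "null"], "format": "date-time"}
        }
      }
    }
  ]
}
""")

print(property_schema(example_catalog, "dbo-my_table", "updated"))
```

If the override had taken effect, the printed schema for 'updated' would be expected to show the overridden type rather than a string with the date-time format.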