Closed jaceksan closed 1 year ago
Hi @jaceksan
Thanks for improving tap-exchangeratehost. I just pulled your code from your forked repo to run
tap-exchangeratehost -c config.sample.json
But I got:
Traceback (most recent call last):
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/bin/tap-exchangeratehost", line 11, in <module>
load_entry_point('tap-exchangeratehost', 'console_scripts', 'tap-exchangeratehost')()
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/tap_exchangeratehost/__init__.py", line 151, in main
while datetime.datetime.strptime(next_date, DATE_FORMAT) < datetime.datetime.utcnow():
TypeError: strptime() argument 1 must be str, not None
Can you check this and fix? I tried the original code but it ran without the error.
Thanks!
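For anyone hitting the same traceback: strptime() received None because no start date reached the loop. The sketch below shows one way to resolve the value up front and fail with a clearer message. It is not the fix that was actually applied; the helper name resolve_start_date, the "start_date" keys, and the DATE_FORMAT value are assumptions for illustration.

```python
import datetime

# Assumed value; the real constant is defined in tap_exchangeratehost/__init__.py
DATE_FORMAT = "%Y-%m-%d"


def resolve_start_date(config, state):
    """Pick the first date to sync: saved state wins, then config, else fail clearly.

    Hypothetical helper: the key name "start_date" is an assumption.
    """
    next_date = (state or {}).get("start_date") or (config or {}).get("start_date")
    if next_date is None:
        raise ValueError("start_date must be set in the config or in the state file")
    # Validate early so a malformed value fails here rather than inside the sync loop
    datetime.datetime.strptime(next_date, DATE_FORMAT)
    return next_date
```

With this in place, a config without a start date fails with a readable ValueError instead of a TypeError deep inside main.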
Fixed. However, I tested it with start_date=2010-01-01 and it is failing when loading data to Snowflake.
snowflake.connector.errors.ProgrammingError: 100080 (22000): Number of columns in file (170) does not match that of the corresponding table (172), use file format option error_on_column_count_mismatch=false to ignore this error
The issue is that the number of columns changes over time, because some currencies did not exist in the past. There are two solutions: keep the current design with one column per currency, or switch to a table with date, currency, and value columns.
I am thinking about what table design best fits my needs.
The ultimate solution would be to support both designs, switching between them by a new configuration parameter.
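A minimal sketch of what switching between the two designs could look like. The function name to_records, the layout parameter, and its values "wide" and "long" are hypothetical; the tap does not define them.

```python
def to_records(date, rates, layout="wide"):
    """Turn one day's rates into output records for the chosen table design.

    "wide": one record per date with one column per currency.
    "long": one record per (date, currency) pair with a fixed set of columns.
    """
    if layout == "wide":
        return [{"date": date, **rates}]
    if layout == "long":
        return [
            {"date": date, "currency": currency, "value": value}
            for currency, value in sorted(rates.items())
        ]
    raise ValueError(f"unknown layout: {layout!r}")
```

The long layout always has three columns, so the column-count mismatch cannot occur no matter which currencies exist on a given date.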
I will try to implement it; please do not merge yet.
OK, the current design works if I update the definition of the Snowflake file format with error_on_column_count_mismatch set to false:
CREATE FILE FORMAT cicd_dev.PUBLIC.meltano_format TYPE = 'CSV' ESCAPE='\\' FIELD_OPTIONALLY_ENCLOSED_BY='"' error_on_column_count_mismatch=false;
Feel free to merge this.
Eh, now I realized that maybe the current design is still not optimal. When a batch (one execution of do_sync) contains rows with different numbers of columns, the schema is currently populated from the first row, which may not contain all the columns present in the last row. Let me update it so the schema is generated from the last row.
Fixed, now the schema is generated from the last row (ordered by date).
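Roughly, the idea is the following. This is a sketch only, not the code in the PR: the function name, the shape of rates_by_date (a dict mapping ISO dates to currency/rate dicts), and the property types are assumptions.

```python
def make_schema_from_last_date(rates_by_date):
    """Build a JSON-schema-like dict from the most recent date in the batch."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings
    last_date = max(rates_by_date)
    properties = {"date": {"type": "string", "format": "date"}}
    for currency in sorted(rates_by_date[last_date]):
        # Nullable, because older rows may not have this currency
        properties[currency] = {"type": ["null", "number"]}
    return {"type": "object", "properties": properties}
```

This relies on currencies only being added over time. If a currency could also disappear, taking the union of the keys across all rows in the batch would be the more defensive choice.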
Can you consider supporting Python 3.8, if not 3.7? Right now it fails on versions before 3.9 like this:
Traceback (most recent call last):
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/bin/tap-exchangeratehost", line 11, in <module>
load_entry_point('tap-exchangeratehost', 'console_scripts', 'tap-exchangeratehost')()
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 480, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2693, in load_entry_point
return ep.load()
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2324, in load
return self.resolve()
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/venv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2330, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "/home/ubuntu/project/tmp_tap_exch/tap-exchangeratehost/tap_exchangeratehost/__init__.py", line 44, in <module>
def make_schema(response: dict, dates: list[str]) -> Dict:
TypeError: 'type' object is not subscriptable
Inserting this may be enough:
# Python ~3.8 support
from __future__ import annotations
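An alternative that avoids the future import is to use the typing generics, which are subscriptable on 3.7 and 3.8. The sketch below keeps the make_schema signature from the traceback and only swaps list[str] for List[str]; the function body is a placeholder and not the tap's implementation.

```python
from typing import Dict, List


def make_schema(response: dict, dates: List[str]) -> Dict:
    # Placeholder body: only the annotations matter for this example
    return {"type": "object", "properties": {}, "dates": list(dates)}
```

Either approach works; the future import is the smaller change because the existing annotations stay as they are.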
Fixed
@jaceksan Thx!
The start_day record can be missing from the result set if everything has already been extracted. Prevent errors like
KeyError: '2023-03-28'
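One way to avoid the lookup that raises, sketched under the assumption that the response has been reduced to a dict mapping ISO dates to currency/rate dicts. The name records_from is hypothetical.

```python
def records_from(rates_by_date, start_day):
    """Yield (date, rates) pairs on or after start_day.

    The function iterates over the dates that are actually present instead of
    indexing by start_day, so a missing start_day yields nothing rather than
    raising KeyError.
    """
    for date in sorted(rates_by_date):
        # ISO dates compare correctly as strings
        if date >= start_day:
            yield date, rates_by_date[date]
```

When everything has already been extracted, the generator is simply empty and the sync can finish without emitting records.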