voxmedia / tap-instagram

Singer Tap for the Instagram Graph API
Apache License 2.0

KeyError 'end_date' for user_insights_audience stream. #13

Open acarter24 opened 1 year ago

acarter24 commented 1 year ago

user_insights_audience stream returns:

{'data': [
{'name': 'audience_city', 'period': 'lifetime', 'values': [{'value': {'Salford, England': 21, 'Manchester, England': 189, 'London, England': 275, 'Lancaster, England': 36, 'Leicester, England': 65, 'Liverpool, England': 137, 'Hong Kong, Hong Kong': 27, 'York, England': 45, 'Newcastle upon Tyne, England': 54, 'Reading, England': 13, 'St Andrews, Scotland': 18, 'Dublin, Dublin': 12, 'Bristol, England': 69, 'Kingston upon Hull, England': 12, 'Jakarta, Jakarta': 13, 'Hatfield, England': 20, 'Paris, Île-de-France': 13, 'Coventry, England': 13, 'Aberdeen, Scotland': 30, 'Exeter, England': 48, 'Nottingham, England': 72, 'Durham, England': 13, 'Shanghai, Shanghai': 122, 'Glasgow, Scotland': 78, 'Delhi, Delhi': 14, 'Portsmouth, England': 57, 'Leeds, England': 80, 'Norwich, England': 15, 'Edinburgh, Scotland': 60, 'Stoke-on-Trent, England': 18, 'Sheffield, England': 58, 'Birmingham, England': 70, 'Bangkok, Bangkok': 14, 'Falmouth, England': 56, 'Stockport, England': 11, 'Northampton, England': 12, 'Cardiff, Wales': 98, 'Tehran, Tehran Province': 13, 'Belfast, Northern Ireland': 16, 'Huddersfield, England': 29, 'Southampton, England': 35, 'Mumbai, Maharashtra': 13, 'Bath, England': 38, 'Brighton and Hove, England': 18, 'Canterbury, England': 12}}], 'title': 'Audience town/city', 'description': "The towns/cities of this profile's followers", 'id': '***/insights/audience_city/lifetime'},
{'name': 'audience_country', 'period': 'lifetime', 'values': [{'value': {'DE': 56, 'BD': 6, 'RU': 10, 'TW': 17, 'HK': 27, 'PT': 8, 'JP': 7, 'FR': 105, 'SA': 13, 'QA': 7, 'BR': 11, 'MA': 6, 'SG': 8, 'DZ': 6, 'KE': 13, 'ID': 23, 'GB': 3133, 'IE': 23, 'OM': 7, 'CA': 18, 'US': 91, 'EG': 15, 'AE': 14, 'CH': 7, 'IN': 83, 'ZA': 10, 'MU': 6, 'IQ': 7, 'IR': 21, 'GR': 11, 'MX': 12, 'IT': 14, 'CN': 147, 'KW': 11, 'MY': 19, 'ES': 16, 'TH': 23, 'AU': 26, 'CY': 9, 'VN': 7, 'PH': 42, 'NG': 26, 'PK': 27, 'NL': 13, 'TR': 14}}], 'title': 'Audience country', 'description': "The countries of this profile's followers", 'id': '***/insights/audience_country/lifetime'},
{'name': 'audience_gender_age', 'period': 'lifetime', 'values': [{'value': {'F.13-17': 12, 'F.18-24': 1185, 'F.25-34': 1075, 'F.35-44': 140, 'F.45-54': 100, 'F.55-64': 39, 'F.65+': 18, 'M.13-17': 2, 'M.18-24': 349, 'M.25-34': 342, 'M.35-44': 82, 'M.45-54': 46, 'M.55-64': 14, 'M.65+': 3, 'U.13-17': 7, 'U.18-24': 425, 'U.25-34': 307, 'U.35-44': 89, 'U.45-54': 31, 'U.55-64': 17, 'U.65+': 9}}], 'title': 'Gender and age', 'description': "The gender and age distribution of this profile's followers", 'id': '***/insights/audience_gender_age/lifetime'},
{'name': 'audience_locale', 'period': 'lifetime', 'values': [{'value': {'el_GR': 8, 'ru_RU': 9, 'it_IT': 15, 'ro_RO': 3, 'tr_TR': 10, 'id_ID': 6, 'pt_BR': 14, 'en_PH': 1, 'th_TH': 6, 'ja_JP': 5, 'fr_FR': 98, 'cs_CZ': 3, 'hu_HU': 1, 'de_DE': 36, 'ms_MY': 1, 'nb_NO': 4, 'zh_HK': 14, 'zh_TW': 16, 'es_MX': 3, 'sk_SK': 2, 'es_ES': 14, 'es_CL': 1, 'nl_NL': 4, 'es_LA': 11, 'sv_SE': 2, 'fa_IR': 2, 'bg_BG': 2, 'vi_VN': 5, 'cy_GB': 2, 'fi_FI': 1, 'ar_AR': 13, 'en_GB': 2754, 'ko_KR': 5, 'en_US': 889, 'uk_UA': 1, 'en_IN': 48, 'zh_CN': 271, 'ar_AE': 1, 'pt_PT': 5}}], 'title': 'Location', 'description': "The locales by country code of this profile's followers", 'id': '***/insights/audience_locale/lifetime'}
]}

parse_response fails with a KeyError looking for end_date in this data, but the period value is lifetime, so I'm not sure this end_date key will ever be present.

Stack trace below

```python
2023-03-08T15:23:25.171108Z [info ] Traceback (most recent call last): cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.171860Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/bin/tap-instagram", line 8, in <module> cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.172429Z [info ] sys.exit(TapInstagram.cli()) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.172861Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.173463Z [info ] return self.main(*args, **kwargs) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.174023Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.174494Z [info ] rv = self.invoke(ctx) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.174871Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.175564Z [info ] return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.176171Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.176700Z [info ] return __callback(*args, **kwargs) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.177238Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/tap_base.py", line 499, in cli cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.177752Z [info ] tap.sync_all() cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.178318Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/tap_base.py", line 379, in sync_all cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.178880Z [info ] stream.sync() cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.179322Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1020, in sync cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.180179Z [info ] self._sync_records(context) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.183917Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 962, in _sync_records cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.184758Z [info ] self._sync_children(child_context) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.185483Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1025, in _sync_children cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.186549Z [info ] child_stream.sync(context=child_context) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.187657Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 1020, in sync cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.189509Z [info ] self._sync_records(context) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.190156Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/core.py", line 946, in _sync_records cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.190613Z [info ] for record_result in self.get_records(current_context): cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.191049Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/tap_instagram/client.py", line 105, in get_records cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.191371Z [info ] for record in self.request_records(context): cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.191704Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/singer_sdk/streams/rest.py", line 323, in request_records cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.192008Z [info ] yield from self.parse_response(resp) cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.192314Z [info ] File "/home/***/vscode_projects/***_elt/***_meltano/.meltano/extractors/tap-instagram/venv/lib/python3.10/site-packages/tap_instagram/streams.py", line 851, in parse_response cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
2023-03-08T15:23:25.192744Z [info ] "end_time": pendulum.parse(values["end_time"]).format( cmd_type=elb consumer=False name=tap-instagram producer=True stdio=stderr string_id=tap-instagram
```

prratek commented 1 year ago

@acarter24 thanks for raising this! I imagine the fix could be a pretty quick change similar to what we have here to first check if end_time is present in the dictionary we're inspecting before attempting to access it. Would you be interested in contributing a PR?
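
For illustration, a minimal sketch of that kind of guard, assuming each item in an insight's `values` list is a plain dict and that lifetime metrics simply omit `end_time`. The helper name `parse_values_entry`, the output field names, and the timestamp format are assumptions, and the standard library's `datetime` stands in for the `pendulum` call seen in the trace.

```python
from datetime import datetime
from typing import Any, Optional


def parse_values_entry(name: str, period: str, entry: dict[str, Any]) -> dict[str, Any]:
    """Flatten one item of an insight's `values` list into a record.

    Hypothetical helper: the real parse_response in streams.py may differ.
    """
    # Lifetime metrics (as in the response above) carry no "end_time" key,
    # so look it up with .get() instead of indexing directly.
    raw_end_time: Optional[str] = entry.get("end_time")
    end_time = None
    if raw_end_time is not None:
        # Assumed timestamp shape: 2023-03-07T08:00:00+0000
        end_time = datetime.strptime(raw_end_time, "%Y-%m-%dT%H:%M:%S%z").isoformat()
    return {"name": name, "period": period, "end_time": end_time, "value": entry["value"]}
```

With this shape, a lifetime entry yields a record whose `end_time` is `None` rather than raising, which still leaves the replication key question open.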

acarter24 commented 1 year ago

I'll have a go, sure :)

Looks like replication_key breaks for this stream though? It inherits from UserInsightsStream, which sets replication_key='end_date', and that results in another KeyError later in the process when serialising JSON for state.

EDIT: UserInsightsOnlineFollowersStream is a lifetime table but doesn't seem to suffer from the same issue, will have to investigate.

prratek commented 1 year ago

Huh okay. It's been long enough since I wrote this that I'm blanking on some implementation details. Without having looked too closely, it seems like there are two paths here:

  1. Split out lifetime insights into their own stream without a replication_key
  2. Set the end date to some arbitrary value like the current timestamp for lifetime metrics. I'm guessing the end date doesn't need to be used in the API request for a lifetime metric anyway.
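
A rough sketch of the second path, assuming lifetime entries lack `end_time` and that stamping them with the sync time is acceptable as a replication value. `with_fallback_end_time` is a hypothetical helper, not something that exists in the tap.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def with_fallback_end_time(
    entry: dict[str, Any], now: Optional[datetime] = None
) -> dict[str, Any]:
    """Return a copy of `entry` that always carries an `end_time`.

    Entries that already have one are copied unchanged; lifetime entries
    get the current UTC time (or the injected `now`, which helps in tests).
    """
    if "end_time" in entry:
        return dict(entry)
    stamp = now if now is not None else datetime.now(timezone.utc)
    return {**entry, "end_time": stamp.isoformat()}
```

The first path would avoid the synthetic timestamp altogether, presumably at the cost of syncing the lifetime metrics in full on every run since there would be no bookmark to resume from.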