singer-io / tap-appsflyer

A Singer.io tap for extracting data from the AppsFlyer API
GNU Affero General Public License v3.0

Extraction encountered a schema violation #2

BjornWesker closed this issue 6 years ago

BjornWesker commented 6 years ago

2018-03-26 09:10:48,727Z main - INFO Running tap-appsflyer version 0.0.1 and target-stitch version 1.7.0
2018-03-26 09:10:48,866Z main - DEBUG Getting initial state
2018-03-26 09:10:48,939Z main - INFO Starting tap: tap-env/bin/tap-appsflyer --config /tmp/tap_config.json --state /tmp/tap_state.json
2018-03-26 09:10:48,960Z main - INFO Starting target: target-env/bin/target-stitch --config /tmp/target_config.json
2018-03-26 09:10:49,327Z tap - INFO do_sync()
2018-03-26 09:10:49,328Z tap - INFO Starting sync. Will sync these streams: ['installs', 'in_app_events']
2018-03-26 09:10:49,328Z tap - INFO Syncing installs
2018-03-26 09:10:49,358Z tap - INFO GET https://hq.appsflyer.com/export/id1022393446/installs_report/v5?from=2018-03-01+00%3A00&api_token=....................................&to=2018-03-26+09%3A10
2018-03-26 09:10:49,505Z target - INFO Using Stitch import URL https://api.stitchdata.com/v2/import/batch
2018-03-26 09:11:06,293Z tap - INFO STATS: {"duration": 16.942935466766357, "status": "succeeded", "http_status_code": 200, "source": "installs"}
2018-03-26 09:11:06,293Z tap - INFO Syncing in_app_events
2018-03-26 09:11:06,295Z main - INFO State update: adding this_stream = "installs"
2018-03-26 09:11:06,295Z main - DEBUG Saving state: {'this_stream': 'installs'}
2018-03-26 09:11:06,297Z tap - INFO GET https://hq.appsflyer.com/export/id1022393446/in_app_events_report/v5?from=2018-03-01+00%3A00&api_token=....................................&to=2018-03-26+09%3A11
2018-03-26 09:12:58,589Z tap - INFO STATS: {"duration": 112.29382634162903, "status": "succeeded", "http_status_code": 200, "source": "in_app_events"}
2018-03-26 09:12:58,604Z tap - Traceback (most recent call last):
2018-03-26 09:12:58,604Z tap - File "tap-env/bin/tap-appsflyer", line 11, in <module>
2018-03-26 09:12:58,604Z tap - sys.exit(main())
2018-03-26 09:12:58,604Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 473, in main
2018-03-26 09:12:58,604Z tap - do_sync()
2018-03-26 09:12:58,605Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 455, in do_sync
2018-03-26 09:12:58,605Z tap - stream.sync() # pylint: disable=not-callable
2018-03-26 09:12:58,605Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 421, in sync_in_app_events
2018-03-26 09:12:58,605Z tap - record = xform(row, schema)
2018-03-26 09:12:58,605Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 91, in xform
2018-03-26 09:12:58,605Z tap - return transform.transform(record, schema)
2018-03-26 09:12:58,605Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/singer/transform.py", line 112, in transform
2018-03-26 09:12:58,606Z tap - raise Exception("Errors at paths {} in data {} for schema {}".format(error_paths, data, schema))
2018-03-26 09:12:58,606Z tap - Exception: Errors at paths [['contributor2_touch_type'], ['af_cost_value'], ['af_ad_type'], ['af_sub3'], ['advertising_id'], ['contributor3_media_source'], ['contributor1_af_prt'], ['contributor1_touch_time'], ['contributor1_touch_type'], ['af_cost_currency'], ['af_siteid'], ['contributor1_campaign'], ['af_sub2'], ['contributor3_af_prt'], ['http_referrer'], ['contributor3_touch_time'], ['imei'], ['af_keywords'], ['carrier'], ['af_sub5'], ['af_sub4'], ['original_url'], ['contributor2_media_source'], ['af_sub_siteid'], ['af_cost_model'], ['android_id'], ['af_sub1'], ['contributor2_campaign'], ['customer_user_id'], ['retargeting_conversion_type'], ['contributor3_campaign'], ['operator'], ['contributor1_media_source'], ['af_prt'], ['contributor2_touch_time'], ['contributor2_af_prt'], ['af_reengagement_window'], ['contributor3_touch_type']] in data {'contributor2_touch_type': None, 'af_cost_value': None, 'af_ad_type': None, 'af_sub3': None, 'is_primary_attribution': 'true', 'region': 'AS', 'idfa': 'ED243952-62D2-4D37-A3A3-3E872468EFC9', 'advertising_id':
None, 'city': 'As Sabkhah', 'contributor3_media_source': None, 'af_c_id': '6080247525522', 'contributor1_af_prt': None, 'attributed_touch_type': 'impression', 'contributor1_touch_time': None, 'os_version': '11.2.6', 'is_retargeting': False, 'ip': '5.31.232.114', 'event_revenue_currency': 'EUR', 'contributor1_touch_type': None, 'af_cost_currency': None, 'app_version': '3.5.3', 'dma': 'None', 'device_type': 'iPhone X', 'app_name': 'Bsit, the childcare app', 'af_siteid': None, 'contributor1_campaign': None, 'event_value': '{"af_revenue":1.75,"af_event_start":"2018-03-01T07:00:40.555Z","af_longitude":4.4780842999999777,"af_customer_user_id":"5a5eb9c9bc9b350014690a5a","af_event_end":"2018-03-01T11:00:40.555Z","af_latitude":50.828281099999991,"af_order_id":"5a89f26b55e4760014096c6e","af_price":28.84,"af_currency":"EUR"}', 'af_sub2': None, 'contributor3_af_prt': None, 'http_referrer': None, 'contributor3_touch_time': None, 'imei': None, 'event_source': 'SDK', 'af_keywords': None, 'carrier': None, 'af_ad_id': '6080247526522', 'af_sub5': None, 'af_adset': 'AllParents_All_en - Instagram', 'af_channel': 'Instagram', 'attributed_touch_time': '2018-01-16 17:50:15', 'af_adset_id': '6080247527722', 'install_time': '2018-01-17 02:49:27', 'user_agent': 'Bsit/3.5.3 CFNetwork/894 Darwin/17.4.0', 'af_sub4': None, 'appsflyer_id': '1516160951593-9081132', 'original_url': None, 'contributor2_media_source': None, 'af_sub_siteid': None, 'bundle_id': 'com.togetair.aribsit', 'af_cost_model': None, 'event_revenue': '1.75', 'wifi': False, 'android_id': None, 'idfv': 'BBC032A0-403A-479C-B0CA-7FA8801CD437', 'event_revenue_usd': '2.133962', 'af_sub1': None, 'postal_code': 'None', 'media_source': 'Facebook Ads', 'af_attribution_lookback': '1d', 'contributor2_campaign': None, 'country_code': 'AE', 'customer_user_id': '5a5eb9c9bc9b350014690a5a', 'retargeting_conversion_type': None, 'is_receipt_validated': None, 'contributor3_campaign': None, 'event_time': '2018-03-01 07:10:36', 'operator': None, 
'campaign': 'Boost_97%_Parents&Sitters_All___SPLIT DB', 'contributor1_media_source': None, 'app_id': 'id1022393446', 'state': 'DU', 'af_prt': None, 'contributor2_touch_time': None, 'contributor2_af_prt': None, 'platform': 'ios', 'sdk_version': 'v4.5.9', 'language': 'en-BE', 'af_reengagement_window': None, 'contributor3_touch_type': None, 'event_name': 'af_finished', 'af_ad': 'Publication Dessin afterschool_en'} for schema {'type': 'object', 'properties': {'contributor2_touch_type': {'type': ['string', 'null']}, 'af_cost_value': {'type': ['string', 'null']}, 'is_retargeting': {'type': ['boolean', 'null']}, 'af_sub3': {'type': ['string', 'null']}, 'is_primary_attribution': {'type': ['boolean', 'null']}, 'region': {'type': ['string', 'null']}, 'idfa': {'type': ['string', 'null']}, 'imei': {'type': ['string', 'null']}, 'event_time': {'type': ['string', 'null'], 'format': 'date-time'}, 'city': {'type': ['string', 'null']}, 'af_c_id': {'type': ['string', 'null']}, 'af_ad_type': {'type': ['string', 'null']}, 'attributed_touch_type': {'type': ['string', 'null']}, 'contributor1_touch_time': {'type': ['string', 'null']}, 'os_version': {'type': ['string', 'null']}, 'ip': {'type': ['string', 'null']}, 'app_id': {'type': ['string', 'null']}, 'postal_code': {'type': ['string', 'null']}, 'contributor1_touch_type': {'type': ['string', 'null']}, 'af_cost_currency': {'type': ['string', 'null']}, 'app_version': {'type': ['string', 'null']}, 'dma': {'type': ['string', 'null']}, 'device_type': {'type': ['string', 'null']}, 'app_name': {'type': ['string', 'null']}, 'contributor2_media_source': {'type': ['string', 'null']}, 'contributor1_campaign': {'type': ['string', 'null']}, 'event_value': {'type': ['string', 'null']}, 'af_sub2': {'type': ['string', 'null']}, 'contributor3_af_prt': {'type': ['string', 'null']}, 'http_referrer': {'type': ['string', 'null'], 'format': 'uri'}, 'contributor3_touch_time': {'type': ['string', 'null']}, 'sdk_version': {'type': ['string', 'null']}, 
'af_keywords': {'type': ['string', 'null']}, 'is_receipt_validated': {'type': ['boolean', 'null']}, 'af_ad_id': {'type': ['string', 'null']}, 'af_sub5': {'type': ['string', 'null']}, 'af_adset': {'type': ['string', 'null']}, 'af_channel': {'type': ['string', 'null']}, 'attributed_touch_time': {'type': ['string', 'null'], 'format': 'date-time'}, 'af_adset_id': {'type': ['string', 'null']}, 'install_time': {'type': ['string', 'null'], 'format': 'date-time'}, 'user_agent': {'type': ['string', 'null']}, 'af_sub4': {'type': ['string', 'null']}, 'appsflyer_id': {'type': ['string', 'null']}, 'event_source': {'type': ['string', 'null']}, 'event_revenue_currency': {'type': ['string', 'null']}, 'event_revenue': {'type': ['string', 'null']}, 'af_siteid': {'type': ['string', 'null']}, 'af_sub_siteid': {'type': ['string', 'null']}, 'bundle_id': {'type': ['string', 'null']}, 'af_cost_model': {'type': ['string', 'null']}, 'original_url': {'type': ['string', 'null'], 'format': 'uri'}, 'wifi': {'type': ['boolean', 'null']}, 'android_id': {'type': ['string', 'null']}, 'idfv': {'type': ['string', 'null']}, 'event_revenue_usd': {'type': ['string', 'null']}, 'af_sub1': {'type': ['string', 'null']}, 'advertising_id': {'type': ['string', 'null']}, 'media_source': {'type': ['string', 'null']}, 'af_attribution_lookback': {'type': ['string', 'null']}, 'contributor2_campaign': {'type': ['string', 'null']}, 'country_code': {'type': ['string', 'null']}, 'customer_user_id': {'type': ['integer', 'null']}, 'retargeting_conversion_type': {'type': ['string', 'null']}, 'event_name': {'type': ['string', 'null']}, 'contributor3_campaign': {'type': ['string', 'null']}, 'contributor3_media_source': {'type': ['string', 'null']}, 'operator': {'type': ['string', 'null']}, 'campaign': {'type': ['string', 'null']}, 'contributor1_media_source': {'type': ['string', 'null']}, 'state': {'type': ['string', 'null']}, 'af_prt': {'type': ['string', 'null']}, 'contributor2_touch_time': {'type': ['string', 'null']}, 
'contributor2_af_prt': {'type': ['string', 'null']}, 'contributor1_af_prt': {'type': ['string', 'null']}, 'platform': {'type': ['string', 'null']}, 'language': {'type': ['string', 'null']}, 'af_reengagement_window': {'type': ['string', 'null']}, 'contributor3_touch_type': {'type': ['string', 'null']}, 'carrier': {'type': ['string', 'null']}, 'af_ad': {'type': ['string', 'null']}}}
2018-03-26 09:12:58,638Z target - INFO Exiting normally
2018-03-26 09:12:58,682Z main - INFO Target exited normally with status 0
2018-03-26 09:12:58,684Z main - INFO Exit status is: Tap failed with code 1. Target succeeded.

KAllan357 commented 6 years ago

Hi @BjornWesker - I'm not 100% sure what's wrong here. One thing I noticed is that customer_user_id is listed in the schema as an integer or null, but it is coming through as a string ('5a5eb9c9bc9b350014690a5a' in the record above).

Adding the string type to the schema for that field should at least move this forward a little.
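To make the mismatch concrete, here is a minimal sketch of a JSON Schema style type check in plain Python. The helper `type_mismatches` and the `JSON_TYPE_CHECKS` table are hypothetical names made up for this illustration; this is not how singer-python's `transform` is implemented, and it does not explain the null-valued fields that also appear in the error paths. It only shows why a string value fails an `['integer', 'null']` declaration.

```python
# Hypothetical helper for illustration only; not singer-python's transform.
JSON_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, but a JSON boolean is not an integer.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
}


def type_mismatches(record, schema):
    """Return the sorted field names whose value matches none of the declared types."""
    bad = []
    for name, value in record.items():
        declared = schema["properties"].get(name, {}).get("type", [])
        if isinstance(declared, str):
            declared = [declared]
        if declared and not any(JSON_TYPE_CHECKS[t](value) for t in declared):
            bad.append(name)
    return sorted(bad)


# Two fields taken from the record and schema in the log above.
record = {"customer_user_id": "5a5eb9c9bc9b350014690a5a", "af_sub1": None}
schema = {
    "type": "object",
    "properties": {
        "customer_user_id": {"type": ["integer", "null"]},
        "af_sub1": {"type": ["string", "null"]},
    },
}

print(type_mismatches(record, schema))  # ['customer_user_id']
```

Under this simplified check, only customer_user_id is flagged, which matches the observation that its declared type is the one that disagrees with the data.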

BjornWesker commented 6 years ago

Hey @KAllan357
Thanks for your answer! We are trying a few things to solve the issue. I will let you know how it goes ;-)

gautiermorel commented 6 years ago

Hi @KAllan357, I'd like to know how we can edit our schema without editing the "tap-appsflyer" project. The customer_user_id type is defined in the in_app_events.json file, right? (We use this tap through Stitch ETL.) Thanks ;)

KAllan357 commented 6 years ago

@gautiermorel You could deselect that field in the Stitch UI and see if the tap works correctly. Doing something like that would isolate customer_user_id as the culprit.

If that's the case, the next step would be to edit the schema as you point out and make a pull request.
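The schema edit being suggested could look roughly like the sketch below. The helper `allow_type` is a hypothetical name, and the inline dictionary stands in for the customer_user_id entry of in_app_events.json; the real file contains many more properties, and the order of the type list in an actual fix is a choice for the pull request, not something this thread settles.

```python
import copy


def allow_type(schema, field, extra_type):
    """Return a copy of `schema` in which `field` also accepts `extra_type`."""
    widened = copy.deepcopy(schema)
    prop = widened["properties"][field]
    types = prop.get("type", [])
    if isinstance(types, str):
        types = [types]
    if extra_type not in types:
        types = list(types) + [extra_type]
    prop["type"] = types
    return widened


# Stand-in for the relevant part of in_app_events.json (assumption: same shape
# as the schema printed in the error message).
in_app_events_schema = {
    "type": "object",
    "properties": {"customer_user_id": {"type": ["integer", "null"]}},
}

patched = allow_type(in_app_events_schema, "customer_user_id", "string")
print(patched["properties"]["customer_user_id"]["type"])  # ['integer', 'null', 'string']
```

The function returns a copy rather than mutating its input, so the original schema stays available for comparison while testing the change.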

BjornWesker commented 6 years ago

Thanks for helping solve this issue!