singer-io / tap-appsflyer

A Singer.io tap for extracting data from the AppsFlyer API
GNU Affero General Public License v3.0

Installation Reports Schema Error #14

Closed vikash6451 closed 5 years ago

vikash6451 commented 5 years ago

Traceback (most recent call last):
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/bin/tap-appsflyer", line 11, in <module>
    load_entry_point('tap-appsflyer==0.0.11', 'console_scripts', 'tap-appsflyer')()
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 462, in main
    do_sync()
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 444, in do_sync
    stream.sync() # pylint: disable=not-callable
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 280, in sync_installs
    record = xform(row, schema)
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/tap_appsflyer/__init__.py", line 95, in xform
    return transform.transform(record, schema)
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/singer/transform.py", line 112, in transform
    raise Exception("Errors at paths {} in data {} for schema {}".format(error_paths, data, schema))
Exception: Errors at paths [['af_c_id'], ['customer_user_id'], ['af_ad_type'], ['contributor1_touch_type'], ['af_sub5'], ['af_sub1'], ['af_keywords'], ['af_sub3'], ['af_cost_currency'], ['af_adset_id'], ['contributor3_media_source'], ['af_reengagement_window'], ['af_ad'], ['contributor3_campaign'], ['af_ad_id'], ['contributor2_af_prt'], ['idfv'], ['event_revenue_usd'], ['contributor1_touch_time'], ['retargeting_conversion_type'], ['contributor1_media_source'], ['contributor2_touch_time'], ['contributor3_touch_type'], ['contributor3_af_prt'], ['attributed_touch_time'], ['contributor2_campaign'], ['af_cost_value'], ['af_sub2'], ['event_revenue_currency'], ['idfa'], ['af_sub4'], ['http_referrer'], ['contributor2_touch_type'], ['af_sub_siteid'], ['contributor1_af_prt'], ['af_prt'], ['contributor2_media_source'], ['af_cost_model'], ['contributor1_campaign'], ['contributor3_touch_time'], ['event_time'], ['event_value'], ['event_revenue'], ['af_channel'], ['android_id'], ['install_time']] in data {'is_receipt_validated': None, 'af_c_id': None, 'customer_user_id': None, 'event_source': 'SDK', 'af_ad_type': None,

dmosorast commented 5 years ago

Hi @vikash6451, this looks like the data received does not match the JSON schema. I notice that some of the values at the end of the data are None, so, for example, customer_user_id should be marked nullable in the JSON Schema in this case.
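As a rough illustration of what "nullable" means here: in JSON Schema a property accepts null when "null" is listed among its types. The sketch below is only an assumption about how the affected properties might be widened; make_nullable is a hypothetical helper, not part of singer-python or this tap, and the {"type": "string"} shape of customer_user_id is assumed rather than taken from the tap's schema files.

```python
# Hypothetical helper (not part of singer-python or tap-appsflyer):
# widen a JSON Schema property's "type" so that null values validate.
def make_nullable(prop_schema):
    if "type" not in prop_schema:
        # No type constraint at all: null is already allowed.
        return dict(prop_schema)
    types = prop_schema["type"]
    if isinstance(types, str):
        types = [types]
    if "null" not in types:
        types = ["null"] + list(types)
    # Return a copy; the input schema is left untouched.
    return {**prop_schema, "type": types}


# Assumed shape of one property from the installs schema.
customer_user_id = {"type": "string"}
print(make_nullable(customer_user_id))
```

Applying a change like this to each path listed in the error would be one way to let None values through, at the cost of a looser schema.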

Also, it looks like this tap is still on a very old version of singer-python. If you bump the version to the latest (5.3.3), does it run successfully, or at least give a better error message?

vikash6451 commented 5 years ago

I'm sorry, but at the risk of sounding ignorant: how do I do that? I am actually trying to replicate the method outlined here, substituting the tap and target of my choice.

dmosorast commented 5 years ago

No problem! In order to make a local change and run it, you'll need to install the tap into your virtual environment from the source code instead of from the PyPI repository (the method step 2 uses in that guide). So, instead of step 2 to install the tap, you can:

  1. From the command line, pull the code with git clone https://github.com/singer-io/tap-appsflyer.git. There is some more info on that here in the GitHub docs.
  2. Change directory into the root of the directory that you cloned (it should be called tap-appsflyer).
  3. Edit the setup.py file to say singer-python==5.3.3 instead of singer-python==1.6.0a2.
  4. Install the tap into your active virtual environment with pip install -e . (the -e means "for editing", and the . means "my current directory").

Now when you run tap-appsflyer --config ..., it will run from the local code. If that fixes the issue, or gives you a more informative message, that would be helpful!
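If editing setup.py by hand feels error-prone, step 3 could also be scripted. This is a minimal sketch only: bump_singer_pin is a hypothetical helper, and the sample install_requires line is made up for illustration and is not the tap's actual setup.py content.

```python
import re


# Hypothetical helper: rewrite the singer-python pin in setup.py text.
def bump_singer_pin(setup_text, new_version="5.3.3"):
    # Match "singer-python==<version>" up to a quote, comma, or whitespace.
    pattern = r"singer-python==[^'\",\s]+"
    return re.sub(pattern, "singer-python==" + new_version, setup_text)


# Made-up fragment standing in for the real install_requires entry.
sample = "install_requires=['singer-python==1.6.0a2', 'requests'],"
print(bump_singer_pin(sample))
```

To apply it for real, you would read setup.py, pass its text through the helper, and write the result back before running the install in step 4.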

vikash6451 commented 5 years ago

Thanks so much for the step-by-step instructions. Really appreciate that. However, I ran into this new problem now.

(tap-appsflyer) (singer) D11MKTG002:~ vikash.kumar$ tap-appsflyer --config ...
Traceback (most recent call last):
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/bin/tap-appsflyer", line 11, in <module>
    load_entry_point('tap-appsflyer', 'console_scripts', 'tap-appsflyer')()
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/pkg_resources/__init__.py", line 565, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2631, in load_entry_point
    return ep.load()
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2291, in load
    return self.resolve()
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2297, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 14, in <module>
    import singer.stats
ImportError: No module named 'singer.stats'

dmosorast commented 5 years ago

Ah, that module was renamed from stats to metrics in more recent versions. I created a branch where that's fixed. Can you take a look at that, so we can get down to the real issue?

To get it, you can use git pull && git checkout feature/upgrade-singer-python, then reinstall with pip install -e . (including the trailing dot).

vikash6451 commented 5 years ago

A new issue now:

Traceback (most recent call last):
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/bin/tap-appsflyer", line 11, in <module>
    load_entry_point('tap-appsflyer', 'console_scripts', 'tap-appsflyer')()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 462, in main
    do_sync()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 444, in do_sync
    stream.sync() # pylint: disable=not-callable
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 271, in sync_installs
    request_data = request(url, params)
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.6/site-packages/backoff.py", line 286, in retry
    ret = target(*args, **kwargs)
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.6/site-packages/singer/utils.py", line 95, in wrapper
    return func(*args, **kwargs)
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 142, in request
    with singer.metrics.http_request_timer(source=parse_source_from_url(url)) as timer:
TypeError: http_request_timer() got an unexpected keyword argument 'source'

dmosorast commented 5 years ago

Sorry about that, I overlooked a couple of changes there. If you git pull it should run through that part.

vikash6451 commented 5 years ago

Some progress, but a new error again:

INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 387.269246339798, "tags": {"endpoint": "installs", "http_status_code": 200, "status": "succeeded"}}
Traceback (most recent call last):
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/bin/tap-appsflyer", line 11, in <module>
    load_entry_point('tap-appsflyer', 'console_scripts', 'tap-appsflyer')()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 462, in main
    do_sync()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 444, in do_sync
    stream.sync() # pylint: disable=not-callable
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 280, in sync_installs
    record = xform(row, schema)
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 95, in xform
    return transform.transform(record, schema)
AttributeError: 'function' object has no attribute 'transform'

dmosorast commented 5 years ago

Just pushed a change for that one. It seems the interface changed for the transformation. You can try pulling and hopefully that'll get it to the actual issue with the data.

vikash6451 commented 5 years ago

We are down to the actual issue now: date-time formatting.

Traceback (most recent call last):
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/bin/tap-appsflyer", line 11, in <module>
    load_entry_point('tap-appsflyer', 'console_scripts', 'tap-appsflyer')()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 461, in main
    do_sync()
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 443, in do_sync
    stream.sync() # pylint: disable=not-callable
  File "/Users/vikash.kumar/tap-appsflyer/tap_appsflyer/__init__.py", line 282, in sync_installs
    if utils.strptime(record["attributed_touch_time"]) > bookmark:
  File "/Users/vikash.kumar/.virtualenvs/tap-appsflyer/lib/python3.6/site-packages/singer/utils.py", line 58, in strptime
    return datetime.datetime.strptime(dtime, DATETIME_PARSE)
  File "/anaconda3/lib/python3.6/_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/anaconda3/lib/python3.6/_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data '2018-11-28T13:46:32.000000Z' does not match format '%Y-%m-%dT%H:%M:%SZ'

dmosorast commented 5 years ago

Great! The error messages are a lot more reasonable in the latest singer-python. You should be able to work through the data issues at this point from a fork of this branch. If you run into any trouble, the folks in the Singer Slack channel should be able to help out, since they'll be more familiar with this version of the library. Feel free to submit a PR back if you get it working with your data!

It looks like this tap does a lot of datetime manipulation, so here are some resources for the latest standard for datetime translation, in the singer-python utils module:

For parsing, taps should use utils.strptime_to_utc and for formatting to string, taps should use utils.strftime. That should ensure that the translation works as expected.
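To show why the last traceback fails: the timestamp carries fractional seconds, but the format string has no %f. The sketch below uses only the standard library and is not singer-python's implementation; parse_utc is a hypothetical stand-in for the role that utils.strptime_to_utc plays, and the two accepted formats are the ones visible in this thread.

```python
from datetime import datetime, timezone

# Formats seen in this thread: with and without fractional seconds.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_utc(value):
    """Parse a 'Z'-suffixed timestamp into a timezone-aware UTC datetime."""
    for fmt in _FORMATS:
        try:
            # The trailing 'Z' means UTC, so attach that timezone explicitly.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: {!r}".format(value))


# The value from the traceback above now parses.
print(parse_utc("2018-11-28T13:46:32.000000Z").isoformat())
```

Returning timezone-aware values also matters for comparisons such as the bookmark check in sync_installs, because Python refuses to compare naive and aware datetimes.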