airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

MixPanel Connection #5880

Closed delhora closed 3 years ago

delhora commented 3 years ago

Environment

Current Behavior

Syncing from MixPanel fails. The source and destination checks both pass, but the connection setup then fails with a "Failed to fetch schema" error.

Expected Behavior

Data should be synced to BigQuery

Logs

LOG

```
2021-09-07 13:30:07 INFO () TemporalAttemptExecution(get):111 - Executing worker wrapper. Airbyte version: 0.29.15-alpha
2021-09-07 13:30:07 INFO () LogClientSingleton(setJobMdc):146 - Setting docker job mdc
2021-09-07 13:30:07 INFO () LineGobbler(voidCall):85 - Checking if airbyte/source-mixpanel:0.1.0 exists...
2021-09-07 13:30:07 INFO () LineGobbler(voidCall):85 - airbyte/source-mixpanel:0.1.0 was found locally.
2021-09-07 13:30:07 INFO () DockerProcessFactory(create):146 - Preparing command: docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/311d551a-9741-49c4-9155-d6a686ae56ad/0 --network host --log-driver none airbyte/source-mixpanel:0.1.0 discover --config source_config.json
2021-09-07 13:30:09 INFO () DefaultAirbyteStreamFactory(internalLog):110 - Using start_date: 2021-11-06, end_date: 2021-09-07
2021-09-07 13:30:09 ERROR () DefaultAirbyteStreamFactory(internalLog):108 - Stream engage_schema: 400 Bad Request - {"request": "/api/2.0/engage/properties", "error": "engage-distributed-query server error: 400 \nfor params: {'project_id': 2194296, 'limit': 2047, 'type': 'properties', 'properties_query_method': 'top_properties', 'query_id': '9dde25178fbd490181303bcef839264a', 'timezone': 'US/Pacific', 'enable_group_derived_files': True, 'enable_engage_query_derived_files': False, 'enable_arb_query_derived_files': False, 'enable_id_mapping': True, 'enable_async_shuffle': True}"}
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - Traceback (most recent call last):
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/airbyte/integration_code/main.py", line 33, in
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - launch(source, sys.argv[1:])
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/entrypoint.py", line 117, in launch
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - for message in source_entrypoint.run(parsed_args):
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/entrypoint.py", line 102, in run
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - catalog = self.source.discover(logger, config)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/abstract_source.py", line 78, in discover
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - streams = [stream.as_airbyte_stream() for stream in self.streams(config=config)]
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/abstract_source.py", line 78, in
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - streams = [stream.as_airbyte_stream() for stream in self.streams(config=config)]
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/streams/core.py", line 80, in as_airbyte_stream
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - stream = AirbyteStream(name=self.name, json_schema=dict(self.get_json_schema()), supported_sync_modes=[SyncMode.full_refresh])
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/airbyte/integration_code/source_mixpanel/source.py", line 527, in get_json_schema
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - for property_entry in schema_properties:
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/streams/http/http.py", line 240, in read_records
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - response = self._send_request(request, request_kwargs)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/airbyte/integration_code/source_mixpanel/source.py", line 98, in _send_request
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - raise e
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/airbyte/integration_code/source_mixpanel/source.py", line 93, in _send_request
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - return super()._send_request(request, request_kwargs)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/backoff/_sync.py", line 94, in retry
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - ret = target(*args, **kwargs)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/backoff/_sync.py", line 94, in retry
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - ret = target(*args, **kwargs)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/airbyte_cdk/sources/streams/http/http.py", line 216, in _send_request
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - response.raise_for_status()
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 953, in raise_for_status
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - raise HTTPError(http_error_msg, response=self)
2021-09-07 13:30:09 ERROR () LineGobbler(voidCall):85 - requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://mixpanel.com/api/2.0/engage/properties
2021-09-07 13:30:09 INFO () TemporalAttemptExecution(lambda$getWorkerThread$2):155 - Completing future exceptionally...
io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1
    at io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:90) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:44) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:152) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at java.lang.Thread.run(Thread.java:832) [?:?]
2021-09-07 13:30:09 INFO () TemporalAttemptExecution(get):139 - Stopping cancellation check scheduling...
2021-09-07 13:30:09 WARN () POJOActivityTaskHandler$POJOActivityImplementation(execute):243 - Activity failure. ActivityId=a481db20-95e9-3378-a2a7-37e6387e63d1, activityType=Run, attempt=1
java.util.concurrent.ExecutionException: io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) ~[?:?]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2063) ~[?:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:137) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at io.airbyte.workers.temporal.DiscoverCatalogWorkflow$DiscoverCatalogActivityImpl.run(DiscoverCatalogWorkflow.java:107) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:564) ~[?:?]
    at io.temporal.internal.sync.POJOActivityTaskHandler$POJOActivityInboundCallsInterceptor.execute(POJOActivityTaskHandler.java:277) ~[temporal-sdk-1.0.4.jar:?]
    at io.temporal.internal.sync.POJOActivityTaskHandler$POJOActivityImplementation.execute(POJOActivityTaskHandler.java:216) ~[temporal-sdk-1.0.4.jar:?]
    at io.temporal.internal.sync.POJOActivityTaskHandler.handle(POJOActivityTaskHandler.java:181) ~[temporal-sdk-1.0.4.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:192) ~[temporal-sdk-1.0.4.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:154) ~[temporal-sdk-1.0.4.jar:?]
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73) ~[temporal-sdk-1.0.4.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
    at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: io.airbyte.workers.WorkerException: Discover job subprocess finished with exit code 1
    at io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:90) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at io.airbyte.workers.DefaultDiscoverCatalogWorker.run(DefaultDiscoverCatalogWorker.java:44) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:152) ~[io.airbyte-airbyte-workers-0.29.15-alpha.jar:?]
    ... 1 more
```

Steps to Reproduce

  1. Create MixPanel Source configuration
  2. Create BigQuery Destination configuration
  3. Link them and try to sync

NOTE: After reading a bit, I found that some of these 400 errors are thrown because my project is on the "eu" version of Mixpanel, and projects in the European residency zone are served from a different host. Maybe it is just a matter of being able to choose the region in the source configuration, as you can in BigQuery (info here: https://developer.mixpanel.com/reference/overview).
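To illustrate the idea in the note, here is a minimal sketch of how a connector could pick the API host from a region setting. The `region` option, the `QUERY_API_HOSTS` table, and the `query_api_url` helper are hypothetical names for illustration, not part of source-mixpanel 0.1.0. The US host is the one shown in the log above; the EU host `eu.mixpanel.com` is my reading of Mixpanel's data-residency docs and should be checked against the reference linked in the note.

```python
# Hypothetical mapping from a "region" config option to the query API host.
# "US" matches the URL seen in the log; "EU" is an assumption taken from
# Mixpanel's data-residency documentation.
QUERY_API_HOSTS = {
    "US": "https://mixpanel.com",
    "EU": "https://eu.mixpanel.com",
}


def query_api_url(path: str, region: str = "US") -> str:
    """Build a query API URL for the given residency region."""
    try:
        host = QUERY_API_HOSTS[region.upper()]
    except KeyError:
        expected = ", ".join(sorted(QUERY_API_HOSTS))
        raise ValueError(f"Unknown region {region!r}; expected one of: {expected}") from None
    # Normalise the path so "engage/properties" and "/engage/properties" both work.
    return f"{host}/api/2.0/{path.lstrip('/')}"


if __name__ == "__main__":
    print(query_api_url("engage/properties"))
    print(query_api_url("engage/properties", region="EU"))
```

With something like this, the failing `engage/properties` request would go to the EU host whenever the user selects that region, and an unrecognised region would fail early with a clear message instead of a 400 from the API.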

sherifnada commented 3 years ago

@delhora will take a look soon!