singer-io / tap-pipedrive

A Singer.io tap for extracting data from the Pipedrive API
GNU Affero General Public License v3.0

Errors during transform #69

Open Thuran opened 4 years ago

Thuran commented 4 years ago

Hi, we got an error without making any changes. The integration was working and then broke on its own. The log is:

2020-02-07 11:57:39,559Z tap - Errors during transform: [: [{'default_currency': 'BRL', 'is_you': False, 'locale': 'pt_BR', 'phone': None, 'active_flag': True, 'name': 'Name, 'icon_url': None, 'signup_flow_variation': 'invite', 'id': 10730570, 'is_admin': 0, 'timezone_name': 'America/Sao_Paulo', 'email': 'email@email.com', 'created': '2019-10-01 12:38:39', 'modified': '2020-02-06 16:01:56', 'timezone_offset': '-03:00', 'has_created_company': False, 'activated': True, 'role_id': 1, 'last_login': '2020-02-06 16:01:56', 'lang': 7}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}]
2020-02-07 11:57:39,559Z tap - Traceback (most recent call last):
2020-02-07 11:57:39,559Z tap - File "tap-env/bin/tap-pipedrive", line 10, in <module>
2020-02-07 11:57:39,559Z tap - sys.exit(main())
2020-02-07 11:57:39,559Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 34, in main
2020-02-07 11:57:39,559Z tap - raise e
2020-02-07 11:57:39,560Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 31, in main
2020-02-07 11:57:39,560Z tap - main_impl()
2020-02-07 11:57:39,560Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 26, in main_impl
2020-02-07 11:57:39,560Z tap - pipedrive_tap.do_sync(catalog)
2020-02-07 11:57:39,560Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/tap.py", line 204, in do_sync
2020-02-07 11:57:39,560Z tap - self.do_paginate(stream, stream_metadata)
2020-02-07 11:57:39,560Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/tap.py", line 250, in do_paginate
2020-02-07 11:57:39,560Z tap - row = optimus_prime.transform(row, stream.get_schema(), stream_metadata)
2020-02-07 11:57:39,560Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/singer/transform.py", line 127, in transform
2020-02-07 11:57:39,560Z tap - raise SchemaMismatch(self.errors)
2020-02-07 11:57:39,561Z tap - singer.transform.SchemaMismatch: Errors during transform
2020-02-07 11:57:39,561Z tap - : [{'default_currency': 'BRL', 'is_you': False, 'locale': 'pt_BR', 'phone': None, 'active_flag': True, 'name': 'Name, 'icon_url': None, 'signup_flow_variation': 'invite', 'id': 10730570, 'is_admin': 0, 'timezone_name': 'America/Sao_Paulo', 'email': 'email@email.com', 'created': '2019-10-01 12:38:39', 'modified': '2020-02-06 16:01:56', 'timezone_offset': '-03:00', 'has_created_company': False, 'activated': True, 'role_id': 1, 'last_login': '2020-02-06 16:01:56', 'lang': 7}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}

dmosorast commented 4 years ago

Hi @Thuran, thanks for reporting this. It looks like Pipedrive may have changed the response structure of the endpoint the tap uses for the delete_log stream.

I don't have a Pipedrive account handy to test myself, but from looking over the code, it seems like the row object here is instead coming out as a list of rows. Given that a top-level JSON array isn't a valid Singer record, and the delete_log stream is a raw object schema, it seems like handling an array at this point as multiple rows would be acceptable.
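A minimal sketch of the idea described above, treating a list-valued row as several rows before it reaches the transformer. This is not the tap's actual code: `iter_rows` is a hypothetical helper, and the sample payload is invented to resemble the one in the log.

```python
from typing import Any, Dict, Iterator, List, Union

Row = Dict[str, Any]


def iter_rows(row: Union[Row, List[Row], None]) -> Iterator[Row]:
    """Yield individual records from a value that may be a dict or a list of dicts.

    Hypothetical helper, not part of tap-pipedrive or singer-python.
    """
    if row is None:
        # Nothing to emit for an empty value.
        return
    if isinstance(row, list):
        # A top-level list is not a valid Singer record, so emit each element.
        for item in row:
            yield from iter_rows(item)
    elif isinstance(row, dict):
        yield row
    else:
        raise TypeError("unexpected row type: {}".format(type(row).__name__))


# Invented sample shaped like the failing payload in the log above.
payload = [{"id": 10730570, "active_flag": True}]
records = list(iter_rows(payload))
```

Each yielded dict could then be passed to the transformer on its own, so a value that is already a single object behaves exactly as before.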

Since I can't test this on my own, I'm adding the Help Wanted tag to this. If you or another community member is able to make a change, provide proof of it fixing the issue (e.g., redacted logs), and open a PR, I'd be glad to review it!

another-mmaia commented 4 years ago

I'm having the exact same issue. I tested the delete_log endpoint from Pipedrive and the response looks like:

{
  "success": true,
  "data": [
    {
      "item": "note",
      "id": 123456,
      "data": {
        "id": 98765,
        "user_id": 123987,

norbag commented 4 years ago

I have also been having this issue since the 7th of February.

Thuran commented 4 years ago

For now, I have disabled the delete_log table in the tap, and the other tables work.

norbag commented 4 years ago

Does anyone have an overview of what the consequences of this error are?

norbag commented 4 years ago

@Thuran I assume you don't use this tap via Stitch?

Thuran commented 4 years ago

Yes, I deactivated it in the replicated tables.

mihai-directimo commented 4 years ago

Hey guys. I've unchecked the "delete_log" table and managed to bring data from Pipedrive into Power BI (after 2 weeks of it being down). But, for some reason, Power BI sees only the latest status for new leads that came in while the integration was down (only 1 line per lead). Is there any way to bring in all the historical information, rather than just the latest status?

LouDub commented 4 years ago

I have an error during transform too, but I don't know whether it has the same cause. Do you have any idea? Thanks!

2020-10-02 09:40:50,218Z tap - INFO replicated 4 records from "delete_log" endpoint
2020-10-02 09:40:50,218Z tap - CRITICAL Errors during transform
2020-10-02 09:40:50,218Z tap - : [{'phone': None, 'active_flag': True, 'modified': '2020-10-02 07:36:11', 'id': 11570894, 'created': '2020-09-08 14:24:37', 'role_id': 1, 'name': 'Camille', 'lang': 8, 'timezone_offset': '+02:00', 'is_admin': 0, 'last_login': '2020-10-02 07:36:11', 'email': '', 'icon_url': 'https://d3myhnqlqw2314.cloudfront.net/profile_120x120_11570894_f9f01c873e92bf37b82c5219125e4b02.jpg', 'has_created_company': False, 'timezone_name': 'Europe/Paris', 'default_currency': 'EUR', 'is_you': False, 'activated': True, 'locale': 'fr_FR', 'signup_flow_variation': None}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}
2020-10-02 09:40:50,218Z tap -
2020-10-02 09:40:50,218Z tap -
2020-10-02 09:40:50,218Z tap - Errors during transform: [: [{'phone': None, 'active_flag': True, 'modified': '2020-10-02 07:36:11', 'id': 11570894, 'created': '2020-09-08 14:24:37', 'role_id': 1, 'name': 'Camille', 'lang': 8, 'timezone_offset': '+02:00', 'is_admin': 0, 'last_login': '2020-10-02 07:36:11', 'email': '', 'icon_url': 'https://d3myhnqlqw2314.cloudfront.net/profile_120x120_11570894_f9f01c873e92bf37b82c5219125e4b02.jpg', 'has_created_company': False, 'timezone_name': 'Europe/Paris', 'default_currency': 'EUR', 'is_you': False, 'activated': True, 'locale': 'fr_FR', 'signup_flow_variation': None}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}]
2020-10-02 09:40:50,218Z tap - Traceback (most recent call last):
2020-10-02 09:40:50,218Z tap - File "tap-env/bin/tap-pipedrive", line 33, in <module>
2020-10-02 09:40:50,218Z tap - sys.exit(load_entry_point('tap-pipedrive==1.0.6', 'console_scripts', 'tap-pipedrive')())
2020-10-02 09:40:50,218Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 34, in main
2020-10-02 09:40:50,218Z tap - raise e
2020-10-02 09:40:50,218Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 31, in main
2020-10-02 09:40:50,218Z tap - main_impl()
2020-10-02 09:40:50,218Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/cli.py", line 26, in main_impl
2020-10-02 09:40:50,218Z tap - pipedrive_tap.do_sync(catalog)
2020-10-02 09:40:50,219Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/tap.py", line 151, in do_sync
2020-10-02 09:40:50,219Z tap - self.do_paginate(stream, stream_metadata)
2020-10-02 09:40:50,219Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/tap_pipedrive/tap.py", line 197, in do_paginate
2020-10-02 09:40:50,219Z tap - row = optimus_prime.transform(row, stream.get_schema(), stream_metadata)
2020-10-02 09:40:50,219Z tap - File "/code/orchestrator/tap-env/lib/python3.5/site-packages/singer/transform.py", line 127, in transform
2020-10-02 09:40:50,219Z tap - raise SchemaMismatch(self.errors)
2020-10-02 09:40:50,219Z tap - singer.transform.SchemaMismatch: Errors during transform
2020-10-02 09:40:50,219Z tap - : [{'phone': None, 'active_flag': True, 'modified': '2020-10-02 07:36:11', 'id': 11570894, 'created': '2020-09-08 14:24:37', 'role_id': 1, 'name': 'Camille', 'lang': 8, 'timezone_offset': '+02:00', 'is_admin': 0, 'last_login': '2020-10-02 07:36:11', 'email': '', 'icon_url': 'https://d3myhnqlqw2314.cloudfront.net/profile_120x120_11570894_f9f01c873e92bf37b82c5219125e4b02.jpg', 'has_created_company': False, 'timezone_name': 'Europe/Paris', 'default_currency': 'EUR', 'is_you': False, 'activated': True, 'locale': 'fr_FR', 'signup_flow_variation': None}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}
2020-10-02 09:40:50,219Z tap -
2020-10-02 09:40:50,219Z tap -
2020-10-02 09:40:50,219Z tap - Errors during transform: [: [{'phone': None, 'active_flag': True, 'modified': '2020-10-02 07:36:11', 'id': 11570894, 'created': '2020-09-08 14:24:37', 'role_id': 1, 'name': 'Camille', 'lang': 8, 'timezone_offset': '+02:00', 'is_admin': 0, 'last_login': '2020-10-02 07:36:11', 'email': '', 'icon_url': 'https://d3myhnqlqw2314.cloudfront.net/profile_120x120_11570894_f9f01c873e92bf37b82c5219125e4b02.jpg', 'has_created_company': False, 'timezone_name': 'Europe/Paris', 'default_currency': 'EUR', 'is_you': False, 'activated': True, 'locale': 'fr_FR', 'signup_flow_variation': None}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}]
2020-10-02 09:40:50,257Z target - INFO Serializing batch with 5375 messages for table delete_log
2020-10-02 09:40:50,293Z target - INFO Sending batch of 415031 bytes to https://api.stitchdata.com/v2/import/batch
2020-10-02 09:40:50,765Z target - INFO Requests complete, stopping loop
2020-10-02 09:40:50,814Z main - INFO Target exited normally with status 0
2020-10-02 09:40:51,433Z main - INFO [smart-services] event successfully sent to kafka: com.stitchdata.extractionJobFinished [32] at offset None
2020-10-02 09:40:51,434Z main - INFO No tunnel subprocess to tear down
2020-10-02 09:40:51,435Z main - INFO Exit status is: Discovery succeeded. Tap failed with code 1 and error message: "Errors during transform". Target succeeded.

fcastellsflip commented 3 years ago

I'm having the same issue "ERRORS DURING TRANSFORM". Looking at the logs, it comes from delete_log endpoint:

 2020-11-12 14:30:40,420Z    tap - INFO replicated 99 records from "delete_log" endpoint
 2020-11-12 14:30:40,420Z    tap - CRITICAL Errors during transform
 2020-11-12 14:30:40,420Z    tap -  : [{'locale': 'en_US', 'email': 'xxxx@xxxx.com', 'is_you': False, 'id': 11311344, 
 'timezone_offset': '+00:00', 'last_login': '2020-11-12 08:57:20', 'is_admin': 0, 'has_created_company': False, 
 'timezone_name': 'Europe/Dublin', 'default_currency': 'EUR', 'lang': 1, 'modified': '2020-11-12 08:57:20', 'active_flag': True, 
 'role_id': 10, 'created': '2020-01-14 14:20:20', 'name': 'XXXXX XXXXX', 'phone': None, 'activated': True, 'icon_url': None, 
 'signup_flow_variation': 'invite'}] does not match {'type': 'object', 'properties': {'id': {'type': 'integer'}}}

jerrydeng commented 3 years ago

I'm having the same issue "ERRORS DURING TRANSFORM" from delete_log endpoint as well

pastime28 commented 3 years ago

The issue persists when using the Stitch integration. I can confirm that turning off the delete_log table in the integration removes the issue. I'll do my best to get a PR created for this since the issue has been open for some time.

happyherp commented 3 years ago

I also have the same problem. I am working on a fix....

KAllan357 commented 3 years ago

I looked into the Pipedrive documentation and noticed that the recents endpoint no longer lists delete_logs as a valid item. I'm beginning to believe that this data is no longer available from the Pipedrive API.

When I make a request to this endpoint and force the items query-param to delete_logs, I do get a 200 response but it is a mix of data. If I change the query-param to anything else not in the list, I get the same response - it does not seem like Pipedrive cares about "invalid" entries for that parameter.
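For anyone who wants to repeat this check, here is a rough sketch of how the request URL could be built and the response inspected. The base URL and the `api_token` and `since_timestamp` parameter names are assumptions taken from my reading of the Pipedrive API and are not confirmed in this thread. `recents_url` and `item_types` are hypothetical helpers, and the `data`/`item` keys follow the response excerpt posted earlier.

```python
from urllib.parse import urlencode

# Assumed base URL for the Pipedrive REST API (not stated in this thread).
BASE_URL = "https://api.pipedrive.com/v1"


def recents_url(api_token, since_timestamp, items=None):
    """Build a URL for the recents endpoint; parameter names are assumptions."""
    params = [("api_token", api_token), ("since_timestamp", since_timestamp)]
    if items is not None:
        # The value is passed through as-is, valid or not.
        params.append(("items", items))
    return "{}/recents?{}".format(BASE_URL, urlencode(params))


def item_types(response_body):
    """Return the set of 'item' values found in a recents-style response body."""
    return {entry.get("item") for entry in response_body.get("data") or []}
```

Fetching the URL for two different `items` values and comparing `item_types` on each parsed body would show whether the parameter is being ignored, as described above.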

Do you see the same behavior @happyherp? I'm leaning towards removing this stream as it no longer seems applicable.

happyherp commented 3 years ago

@KAllan357 I will look into this on Monday. You might be right. We just started using this tap, so I never actually saw it working.

But if it does not work anymore - does that mean we no longer get information about deletions at all? That would be bad, as deleted items would still be in our data warehouse.

KAllan357 commented 3 years ago

@happyherp I think that would be a question for Pipedrive. It does look like some objects like Deals and Organizations have a deleted and active_flag field respectively. The active_flag key appears in several of the tap's schemas. https://pipedrive.readme.io/docs/faq
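As a rough illustration, a downstream job could use those two fields to spot removed records. This is only a sketch: `looks_deleted` is a hypothetical helper, the sample records are invented, and which objects carry which field should be checked against the Pipedrive documentation.

```python
def looks_deleted(record):
    """Best-effort check for a soft-deleted record.

    Hypothetical helper: relies only on the `deleted` and `active_flag`
    fields mentioned above and treats a missing field as "not deleted".
    """
    if record.get("deleted") is True:
        return True
    if record.get("active_flag") is False:
        return True
    return False


# Invented sample records for illustration.
rows = [
    {"id": 1, "deleted": True},
    {"id": 2, "active_flag": False},
    {"id": 3, "active_flag": True},
]
deleted_ids = [row["id"] for row in rows if looks_deleted(row)]
```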

happyherp commented 3 years ago

@KAllan357 you are right. The active flags are working. Maybe the delete_log stream can be removed altogether. Unfortunately, other things came up and I could not spend more time on this. I will get back to it at a later date.

happyherp commented 3 years ago

> @happyherp I think that would be a question for Pipedrive. It does look like some objects like Deals and Organizations have a deleted and active_flag field respectively. The active_flag key appears in several of the tap's schemas. https://pipedrive.readme.io/docs/faq

I can confirm that this is the case. I also say we should remove the stream.