singer-io / tap-typeform

Singer.io tap for extracting TypeForm data
GNU Affero General Public License v3.0
Landings & Answers tables don't load #28

Open phil-kramer opened 3 years ago

phil-kramer commented 3 years ago

Repeating an issue that was raised here once already: only my Questions table is extracting and loading, despite there being new answers and landing events. I noticed this last week and reset the integration, which recovered all the historical Landings and Answers data, only for the problem to recur days later. Is there a fix for this?

[Screenshots: Screen Shot 2021-04-04 at 12 07 59 PM, Screen Shot 2021-04-04 at 12 07 51 PM]

phil-kramer commented 3 years ago

Further color here: my answers table loaded during the reset, and then never refreshed or loaded new data again after that.

[Screenshot: Screen Shot 2021-04-04 at 12 16 02 PM]

happyherp commented 3 years ago

We have the same problem. Questions keep getting updated, even if there are no new rows. Answers and Landings get no rows at all. This seems to have started around April 20.

moseleyi commented 3 years ago

I have the same problem. Is a fix in the making? It makes the data unreliable.

anchal-agarwal commented 3 years ago

I have the same issue. This is what I discovered. Hope this helps.

I'm using tap-typeform v1.2.0

Scenario:

  1. Increment is set to hourly.
  2. I provided the start date (2021-06-23T00:00:00Z) but not the end date.
  3. The state has date_to_resume at 2021-06-23 09:00:00
  4. A typeform response is submitted at 2021-06-23 09:00:04
  5. The tap is executed at 2021-06-23 09:00:28

Log:

2021-06-23 09:00:28 - start_date: 2021-06-23T00:00:00+00:00
2021-06-23 09:00:28 - end_date: 2021-06-23T09:00:00+00:00
2021-06-23 09:00:28 - last_date: 2021-06-23T09:00:00+00:00
2021-06-23 09:00:28 - ut_current_date: 1624438800
2021-06-23 09:00:28 - ut_next_date: 1624442400
2021-06-23 09:00:28 - Forms query - form: MYFORM start_date: 2021-06-23 09:00 end_date: 2021-06-23 10:00
2021-06-23 09:00:28 - raw data items= 1
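As a quick sanity check on the log, the two Unix timestamps can be decoded to confirm that the queried window really is 09:00 to 10:00 UTC, that is, a full hour extending past end_date. This is only a small helper sketch; the function name to_utc is made up for illustration and is not part of the tap.

```python
from datetime import datetime, timezone

def to_utc(ts):
    """Convert a Unix timestamp in seconds to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

# Values copied from ut_current_date and ut_next_date in the log
window_start = to_utc(1624438800)
window_end = to_utc(1624442400)
print(window_start, window_end)
```

The decoded window ends at 10:00:00, one hour after the logged end_date of 09:00:00.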

Problem:

next_date is set to 2021-06-23 10:00:00, i.e. current_date + 1 hour, which is later than end_date. The state's date_to_resume is then set to next_date, so in the next tap run any responses submitted between 2021-06-23 09:00:28 and 2021-06-23 10:00:00 are not picked up.

The same thing then happens on every subsequent run (scheduled hourly), so no newly submitted responses are picked up.

Solution:

I think changing line 251 of streams.py (v1.2.0) from "while current_date <= end_date:" to "while current_date < end_date:" may solve the problem. I don't think any responses would be skipped.
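The effect of the comparison can be reproduced with a small simulation. This is a simplified model and not the tap's actual code: the names run_tap and simulate, the truncation of end_date to the top of the hour (inferred from the log above), and the second response at 09:30:00 are all assumptions made for illustration.

```python
import operator
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)

def run_tap(now, bookmark, responses, inclusive):
    """Model one hourly run; return (responses picked up, new bookmark)."""
    # Assumption: end_date is 'now' truncated to the hour, as in the log.
    end_date = now.replace(minute=0, second=0, microsecond=0)
    keep_going = operator.le if inclusive else operator.lt
    current, picked = bookmark, []
    while keep_going(current, end_date):
        next_date = current + HOUR
        # Only responses that already exist at run time can be returned.
        picked += [r for r in responses if current <= r < next_date and r <= now]
        current = next_date  # becomes date_to_resume for the next run
    return picked, current

def simulate(inclusive):
    """Run the tap at 09:00:28 and 10:00:28 and collect what it picks up."""
    t = lambda h, m, s: datetime(2021, 6, 23, h, m, s, tzinfo=timezone.utc)
    responses = [t(9, 0, 4), t(9, 30, 0)]  # the 09:30:00 response is hypothetical
    bookmark, seen = t(9, 0, 0), []
    for now in (t(9, 0, 28), t(10, 0, 28)):
        picked, bookmark = run_tap(now, bookmark, responses, inclusive)
        seen += picked
    return seen, bookmark

print(len(simulate(inclusive=True)[0]))   # with <=
print(len(simulate(inclusive=False)[0]))  # with <
```

In this model the "<=" variant advances the bookmark to 10:00 during the 09:00:28 run, so the 09:30:00 response is never queried, while the "<" variant waits until the hour has fully elapsed and picks up both.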

anchal-agarwal commented 3 years ago

The solution I gave above worked. All responses submitted before end_date are picked up in the current tap run. Any responses submitted after the end_date are picked up in the next run (after end_date increments). So I am now not missing any responses.

Hope this issue is fixed with the next release so I can use it and remove the local copy.

hespelz commented 3 years ago

Issue #25 from Aug 2020 looks like it's referencing the same problem.

The loop condition mentioned above looks like it's on line 283 of streams.py now. I came here from Stitch support with the same issue. I've tried resetting the integration, but I still only get the questions table.

The extraction is perpetually "In Progress" and returns 0 rows (you can see it started at 3pm, and the screenshot was taken at almost 6pm).

[Screenshot: Screenshot from 2021-07-02 17-49-36]

yinafoodles commented 3 years ago

Same issue on my end as well. I can fetch responses with a curl command and my token, which means it's really a Stitch issue. It only works when I do a full sync, which is really problematic; otherwise only the questions are refreshed. Stitch support proposed no solution and just redirected me here. Did anyone find a solution?
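For anyone who wants to repeat that check outside of curl, here is a minimal sketch that builds the equivalent request against the Typeform Responses API for a fixed time window. The helper name build_responses_request and the placeholder form ID and token are assumptions for illustration; the since and until values reuse the Unix timestamps from the log earlier in this thread. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_responses_request(form_id, token, since, until):
    """Build (but do not send) a GET request for a form's responses."""
    # since/until limit results to responses submitted in that window
    query = urlencode({"since": since, "until": until})
    url = f"https://api.typeform.com/forms/{form_id}/responses?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})

# Placeholder form ID and token; replace with real values before sending
req = build_responses_request("MYFORM", "YOUR_TOKEN", 1624438800, 1624442400)
print(req.full_url)
# To send it: urllib.request.urlopen(req) and read the JSON body
```

If this returns the missing responses for a window the tap has already passed, the data is available from the API and the gap comes from how the tap advances its bookmark.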