voxmedia / tap-instagram

Singer Tap for the Instagram Graph API
Apache License 2.0

Split out streams for 'lifetime' time periods #15

Closed: acarter24 closed this 7 months ago

acarter24 commented 1 year ago

Fixes an issue where certain streams failed because they looked for an end_time that was not present.

Also adds end_time to the PK list for the Period stream to allow collection of historical periods; otherwise replication is keyed on id only and each sync overwrites the last period's values.
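To illustrate the overwrite, here is a minimal sketch of how a target that upserts on key properties behaves. The `upsert` helper and the sample records are hypothetical, not code from this tap or from any target; they only show why keying on id alone keeps a single row per metric while a composite key keeps one row per period.

```python
def upsert(table, record, key_properties):
    """Insert or replace `record` in `table`, keyed by its key-property values."""
    key = tuple(record[k] for k in key_properties)
    table[key] = record
    return table


# Hypothetical records: the same insight id reported for two different periods.
records = [
    {"id": "123/insights/reach/day", "end_time": "2023-03-01T08:00:00+0000", "value": 10},
    {"id": "123/insights/reach/day", "end_time": "2023-03-02T08:00:00+0000", "value": 12},
]

by_id = {}
by_id_and_end_time = {}
for rec in records:
    upsert(by_id, rec, ["id"])
    upsert(by_id_and_end_time, rec, ["id", "end_time"])

print(len(by_id))               # 1: the second period replaced the first
print(len(by_id_and_end_time))  # 2: both periods are kept
```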

Resolves #14 #13

prratek commented 1 year ago

Thanks @acarter24! This approach broadly makes sense to me, and we can get around to reviewing and merging in the next couple of weeks. If you have steps to reproduce the SQLite error in particular, that would make it easier for us to test that the change resolves that issue (and doesn't break our usage with target-bigquery in any way).

acarter24 commented 1 year ago

Cheers. In terms of reproducing the error, I pretty much just downloaded the tap and ran it and got the error.

meltano, version 2.16.1; Python 3.10

Would the json payload from the API be useful?

prratek commented 1 year ago

Nope, shouldn't need the payload! Other info that would be helpful:

acarter24 commented 1 year ago

To recreate:

Loaders: https://github.com/MeltanoLabs/target-sqlite and https://github.com/jwills/target-duckdb both show the issue

Stream selection to narrow it down to the failing stream only:

    select:
    - user_insights_audience.*

NB: if I invert the filter, clear state, etc., it completes without issue:

    select:
    # Quoted because a bare leading ! or * is YAML tag/alias syntax.
    - '!user_insights_audience.*'
    - '*.*'

EDIT: Just FYI, I was also able to solve one part of the issue by overriding key-properties in meltano.yml for the relevant tables:

  - name: tap-instagram
    variant: voxmedia
    pip_url: tap-instagram
    config:
      ig_user_ids: ${TAP_INSTAGRAM_IG_USER_IDS}
      access_token: ${TAP_INSTAGRAM_ACCESS_TOKEN}
      start_date: 2023-03-01
    metadata:
      user_insights_*:
        key-properties: ["end_time", "id"]
    select:
    - '*.*'
    - '!user_insights_audience.*'
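
For anyone unsure how the two select rules above combine, here is a rough sketch of the matching logic. It assumes that exclusion patterns (those starting with `!`) take precedence over inclusion patterns regardless of order; `is_selected` and the attribute names are hypothetical and are not Meltano's actual implementation.

```python
from fnmatch import fnmatchcase


def is_selected(entry, patterns):
    """Return True if `entry` matches an include pattern and no '!' exclude pattern."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    # Assumption: an exclusion match always wins over any inclusion match.
    if any(fnmatchcase(entry, p) for p in excludes):
        return False
    return any(fnmatchcase(entry, p) for p in includes)


patterns = ["*.*", "!user_insights_audience.*"]
print(is_selected("user_insights_audience.value", patterns))  # False
print(is_selected("media.id", patterns))                      # True
```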