Closed jim-barlow closed 3 years ago
Hi @jimbeepbeep thank you for this issue! Looks like a great use-case for Airbyte indeed. Let me contact you on slack so we can discuss how to move forward on this integration.
Great to catch up on this @michel-tricot, let me know if there's anything we can do to get it kicked off.
Seems like the Instagram docs have moved. New location here
As far as I can tell from looking at the API docs we should be able to do this once we have a few prerequisites:
then we can rock and roll.
That sounds about right @sherifnada, they make you do a merry little dance before you can get any data, and the error messages can be a little confusing if not downright misleading! One really great resource here is the Graph API Explorer, which is also the easiest place to generate an access token for testing. In order to access the data from multiple Instagram Accounts (which is essential and would also be pretty unique to Airbyte), you need to generate a User Token with a set of permissions. I used the following (although in reality you probably don't need them all):
pages_manage_cta
pages_manage_instant_articles
pages_show_list
ads_management
business_management
instagram_basic
instagram_manage_comments
instagram_manage_insights
pages_read_engagement
pages_manage_metadata
pages_read_user_content
pages_manage_ads
pages_manage_posts
pages_manage_engagement
public_profile
As an example, the query I use to get all of the Instagram Accounts which I have access to, and which are linked to my client's Facebook Business Account (you can see this in the Facebook App Dashboard in the Basic -> Verification section) is:
FACEBOOK_BUSINESS_ACCOUNT/instagram_business_accounts?fields=id,ig_id,username&limit=500
Note that this is prepended with https://graph.facebook.com/{graph-api-version}/
i.e. https://graph.facebook.com/v9.0/
and appended with &access_token={access_token}
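As a minimal sketch of how those pieces fit together, the snippet below assembles the full request URL using only the Python standard library. The function name `build_accounts_url` and the placeholder values are illustrative assumptions; they are not part of any Airbyte or Facebook library.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_accounts_url(business_id: str, access_token: str,
                       api_version: str = "v9.0", limit: int = 500) -> str:
    """Build the instagram_business_accounts query described above."""
    query = urlencode(
        {"fields": "id,ig_id,username", "limit": limit, "access_token": access_token},
        safe=",",  # keep the comma-separated field list unescaped
    )
    return f"{GRAPH_BASE}/{api_version}/{business_id}/instagram_business_accounts?{query}"


# Placeholders only: substitute a real business account ID and token.
url = build_accounts_url("FACEBOOK_BUSINESS_ACCOUNT", "ACCESS_TOKEN")
print(url)
```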
We had numerous issues figuring out the right setup to get this working, so let me know if you have any questions! Also it's interesting to note that you can get a lot of data from field expansion, which could minimise the number of endpoints you need to hit; you just need to process the paginated responses, which I'm sure you guys are pretty pro at.
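For the pagination part, a rough sketch of walking a paginated response could look like the following. It assumes the usual Graph API response shape, with records under `data` and the next page URL under `paging.next`. The `fetch` callable and the canned pages are hypothetical stand-ins for real HTTP calls.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_records(first_url: str, fetch: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every record from a paginated, Graph API style response.

    `fetch` is any callable that takes a URL and returns the decoded JSON body.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        # The next page URL lives under paging.next and is absent on the last page.
        url = page.get("paging", {}).get("next")


# Canned pages standing in for real HTTP responses.
fake_pages = {
    "page1": {"data": [{"id": "1"}, {"id": "2"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"id": "3"}]},
}
records = list(iter_records("page1", fake_pages.__getitem__))
```

Passing the fetch function in keeps the loop testable without network access; a real caller would supply a function that performs the GET and decodes the JSON.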
Thanks for the context @jimbeepbeep! We'll realistically be able to start work on this in the coming couple of weeks. We'll begin work on the account setup/data population etc next week
That's great @sherifnada, let me know if you need anything else. Our experience was that getting the accounts set up correctly was one of the biggest headaches and the Facebook/Instagram terminologies and requirements can be really confusing. If you're having any issues then please contact me and I can probably get you set up with access to a few shared accounts for testing purposes.
Scoping results for this task:
Having investigated the documentation of the Facebook Graph API related to Instagram, I can confirm the information that was indicated above, we can read the data for the following streams using Facebook Graph API:
I tried testing requests with Airbyte's existing test Facebook business account, but did not find any Instagram account linked to it. I agree with @sherifnada's message: before starting development, we need to link an Instagram account and fill it with data.
Assuming the second step is complete, my estimate of the time to implement this connector with rate limiting handling is 4 days.
Breakdown:
@yevhenii-ldv could you create tickets for the above breakdown and add to the connector roadmap?
As confirmed to Sherif on Slack, I can help with the test account as setting up all of the links is a bit of a nightmare! I'll DM the details there.
@jimbeepbeep small question on the UX for this connector: would you expect each instance of the connector to sync data for one instagram account, or would you want a single connector instance to pull data for multiple accounts?
by account do you mean credentials or user handle? if it is credentials, we need to be consistent with our definition of connector and it should only be one per connector instance.
@sherifnada that's exactly the right question. A single set of credentials can have multiple accounts associated, and the precise problem we had with existing approaches was that they just worked for a single account. We definitely need multiple accounts but not multiple tables. Currently we stream the API response as JSON into BQ and then decode into a nested table using a custom function.
@jimbeepbeep Question about User Insights: which metric/period combinations make sense? My current inclination is to expose all of them under one table which has columns reflecting each possible metric/period combination. The schema of the table would look something like:
user_id
reach_1d
reach_7d
reach_28d
impressions_1d
impressions_7d
impressions_28d
website_clicks
etc...
WDYT about this breakdown?
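To make the proposed layout concrete, here is a small sketch that derives the column names from metric/period combinations. The metric list is an illustrative subset, and the mapping from period names to suffixes is an assumption made for this example, not the connector's actual schema.

```python
from itertools import product

# Illustrative subset only; the full metric and period lists come from the API docs.
METRICS = ["reach", "impressions"]
# Assumed mapping from API period names to the column suffixes proposed above.
PERIOD_SUFFIXES = {"day": "1d", "week": "7d", "days_28": "28d"}


def insight_columns(metrics, period_suffixes):
    """Return one column per metric/period pair, preceded by user_id."""
    columns = ["user_id"]
    columns += [f"{metric}_{suffix}"
                for metric, suffix in product(metrics, period_suffixes.values())]
    return columns


columns = insight_columns(METRICS, PERIOD_SUFFIXES)
print(columns)
```

Metrics that are only valid for a single period (such as website_clicks in the list above) would need to be appended separately rather than crossed with every period.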
@sherifnada yes that sounds like a good approach.
@sherifnada how are you guys going on this? I need to update some of the metrics and accounts which my (hopefully stop-gap!) solution is pulling and wondered if it would be helpful for me to test anything with real data?
@jimbeepbeep I'd say we're ~85% done with the connector and expect to merge it in the next couple of days. Would that work for your schedule?
@sherifnada you guys are amazing, thanks!
@jimbeepbeep hi!
We just added this connector and merged it into master.
The connector will be available in the next release of Airbyte (we usually release every Tuesday). If you can't wait and want to try it out now in a running Airbyte instance, add the connector as described in https://docs.airbyte.io/integrations/custom-connectors#adding-your-connectors-in-the-ui . The information for the connector is:
Display Name: Instagram API
Docker repository name: airbyte/source-instagram
Docker image tag: 0.1.0
Documentation URL: hub.docker.com/r/airbyte/source-instagram
Please let us know if you encounter any issues or have any questions.
Enjoy!
Amazing, thanks team. We'll get on testing this today!
OK @sherifnada I have added the connector according to your instructions above, which seem to work fine. However when I try and test the connection using the secret which we use in production (stored in Google Secrets Manager) I get the following error:
Failed Logs:
2021-03-16 09:35:26 INFO (/tmp/workspace/67/0) WorkerRun(call):62 - Executing worker wrapper. Airbyte version: AIRBYTE_VERSION
2021-03-16 09:35:26 INFO (/tmp/workspace/67/0) TemporalAttemptExecution(get):79 - Executing worker wrapper. Airbyte version: AIRBYTE_VERSION
2021-03-16 09:35:26 INFO (/tmp/workspace/67/0) LineGobbler(voidCall):69 - Checking if airbyte/source-instagram:0.1.0 exists...
2021-03-16 09:35:26 INFO (/tmp/workspace/67/0) LineGobbler(voidCall):69 - airbyte/source-instagram:0.1.0 was found locally.
2021-03-16 09:35:26 DEBUG (/tmp/workspace/67/0) DockerProcessBuilderFactory(create):104 - Preparing command: docker run --rm -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/67/0 --network host airbyte/source-instagram:0.1.0 check --config source_config.json
2021-03-16 09:35:27 ERROR (/tmp/workspace/67/0) DefaultAirbyteStreamFactory(internalLog):108 - Error: 190, Error validating access token: The session is invalid because the user logged out.
2021-03-16 09:35:27 ERROR (/tmp/workspace/67/0) DefaultAirbyteStreamFactory(internalLog):108 - Check failed
2021-03-16 09:35:28 DEBUG (/tmp/workspace/67/0) DefaultCheckConnectionWorker(run):93 - Check connection job subprocess finished with exit code 0
2021-03-16 09:35:28 DEBUG (/tmp/workspace/67/0) DefaultCheckConnectionWorker(run):94 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@1f8b6e17[status=failed,message=Error: 190, Error validating access token: The session is invalid because the user logged out.]
If I log into the Graph API Explorer and generate another User Token with these permissions:
- pages_manage_cta
- pages_manage_instant_articles
- pages_show_list
- ads_management
- business_management
- instagram_basic
- instagram_manage_comments
- instagram_manage_insights
- pages_read_engagement
- pages_manage_metadata
- pages_read_user_content
- pages_manage_ads
- pages_manage_posts
- pages_manage_engagement
- public_profile
'Testing Connection' takes a little bit longer, then I get the following error:
With no logs available to share. Let me know any steps you think I might need to debug!
I have checked through the server logs and I think I've found the issue. I'm not sure where to enter the account_id (and also which account_id it is), but that seems to be the root cause of the problem:
2021-03-16 09:46:20 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/sources/create
2021-03-16 09:46:20 DEBUG cache hit: airbyte/source-instagram:0.1.0
2021-03-16 09:46:20 DEBUG Known exception
io.airbyte.server.errors.KnownException: The provided configuration does not fulfill the specification. Errors: json schema validation failed.
errors: $.account_id: is missing but it is required
schema:
{
"type" : "object",
"title" : "Source Instagram",
"$schema" : "http://json-schema.org/draft-07/schema#",
"required" : [ "account_id", "access_token" ],
"properties" : {
"start_date" : {
"type" : "string",
"pattern" : "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$",
"examples" : [ "2020-09-25T00:00:00Z" ],
"description" : "The date from which you'd like to replicate data for User Insights, in the format YYYY-MM-DDT00:00:00Z. All data generated after this date will be replicated."
},
"access_token" : {
"type" : "string",
"description" : "The value of the access token generated. See the <a href=\"https://docs.airbyte.io/integrations/sources/instagram\">docs</a> for more information",
"airbyte_secret" : true
}
},
"additionalProperties" : false
}
object:
{
"start_date" : "2021-01-01T00:00:00Z",
"access_token" : "[JB_REDACTED]"
}
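As a rough illustration of why this check fails, the sketch below reproduces the "required" part of the validation with plain Python. It is not the validator Airbyte uses, and the helper names are hypothetical. Note that, as printed in the log, the schema lists account_id under `required` but defines no such property while `additionalProperties` is false, so no config could pass it.

```python
def missing_required(schema: dict, config: dict) -> list:
    """Return JSONPath-style names of required keys absent from the config."""
    return [f"$.{key}" for key in schema.get("required", []) if key not in config]


def unsatisfiable_required(schema: dict) -> list:
    """Required keys that the schema itself forbids (not declared, extras disallowed)."""
    if schema.get("additionalProperties", True):
        return []
    declared = schema.get("properties", {})
    return [key for key in schema.get("required", []) if key not in declared]


# Trimmed copy of the schema and config object from the server log above.
schema = {
    "type": "object",
    "required": ["account_id", "access_token"],
    "properties": {
        "start_date": {"type": "string"},
        "access_token": {"type": "string"},
    },
    "additionalProperties": False,
}
config = {"start_date": "2021-01-01T00:00:00Z", "access_token": "REDACTED"}

for path in missing_required(schema, config):
    print(f"{path}: is missing but it is required")
```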
@jimbeepbeep hi!
Thank you very much for your attention and information provided.
You are absolutely right, the problem lies precisely with the account_id.
I have already created an issue for your request and we will solve this problem soon (I think within the next few hours). As soon as we update the connector, I will immediately inform you about it in the comments to this issue.
Thanks a lot!
@jimbeepbeep We have fixed the bug and now an updated version of the Instagram connector is available. Could you try the new Instagram connector version and let us know if that works? ;)
Thanks @yevhenii-ldv - what do I need to do to get the revised version please? Is there an increment to the docker image tag or do I need to upgrade as per this guide? I have two instances running on different VMs on Google Cloud - the one I used previously still shows the connection I manually created (which still does not work with valid credentials), and I can't see Instagram as an option in either. I manually created another connection with 0.1.0 image tag and get the same error in the server logs:
Caused by: io.airbyte.validation.json.JsonValidationException: json schema validation failed.
errors: $.account_id: is missing but it is required
OK I created an entirely new VM and instance and the Instagram connector did show up, however it still fails. Posting the entire server log as there are a couple of references to Instagram in there:
___ _ __ __
/ | (_)____/ /_ __ __/ /____
/ /| | / / ___/ __ \/ / / / __/ _ \
/ ___ |/ / / / /_/ / /_/ / /_/ __/
/_/ |_/_/_/ /_.___/\__, /\__/\___/
/____/
--------------------------------------
Now ready at http://localhost:8000/
--------------------------------------
Version: 0.17.1-alpha
2021-03-17 06:19:34 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/workspaces/get
2021-03-17 06:19:36 INFO REQ 172.18.0.1 POST 200 /api/v1/workspaces/get - {"workspaceId":"5ae6b09b-fdec-41af-aaf7-7d94cfc33ef6"}
2021-03-17 06:19:36 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/web_backend/connections/list
2021-03-17 06:19:36 INFO REQ 172.18.0.1 POST 200 /api/v1/web_backend/connections/list - {"workspaceId":"5ae6b09b-fdec-41af-aaf7-7d94cfc33ef6"}
2021-03-17 06:19:36 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/source_definitions/list
2021-03-17 06:19:37 INFO REQ 172.18.0.1 POST 200 /api/v1/source_definitions/list - {"workspaceId":"5ae6b09b-fdec-41af-aaf7-7d94cfc33ef6"}
2021-03-17 06:19:41 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/sources/list
2021-03-17 06:19:41 INFO REQ 172.18.0.1 POST 200 /api/v1/sources/list - {"workspaceId":"5ae6b09b-fdec-41af-aaf7-7d94cfc33ef6"}
2021-03-17 06:19:54 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/source_definition_specifications/get
2021-03-17 06:19:54 DEBUG cache miss: airbyte/source-instagram:0.1.0
2021-03-17 06:19:54 INFO enqueuing pending job for scope: airbyte/source-instagram:0.1.0
2021-03-17 06:19:54 INFO Waiting for job id: 3
2021-03-17 06:20:01 INFO REQ 172.18.0.1 POST 200 /api/v1/source_definition_specifications/get - {"sourceDefinitionId":"6acf6b55-4f1e-4fca-944e-1a3caef8aba8"}
2021-03-17 06:20:31 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/scheduler/sources/check_connection
2021-03-17 06:20:32 INFO enqueuing pending job for scope:
2021-03-17 06:20:32 INFO Waiting for job id: 4
2021-03-17 06:20:48 INFO REQ 172.18.0.1 POST 200 /api/v1/scheduler/sources/check_connection - {"sourceDefinitionId":"6acf6b55-4f1e-4fca-944e-1a3caef8aba8","connectionConfiguration":"REDACTED"}
2021-03-17 06:20:48 INFO REQ 172.18.0.1 OPTIONS 200 /api/v1/sources/create
2021-03-17 06:20:48 DEBUG cache hit: airbyte/source-instagram:0.1.0
2021-03-17 06:20:48 WARN Unknown keyword examples - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2021-03-17 06:20:48 WARN Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2021-03-17 06:20:48 DEBUG Known exception
io.airbyte.server.errors.KnownException: The provided configuration does not fulfill the specification. Errors: json schema validation failed.
errors: $.account_id: is missing but it is required
Please let me know next steps. Also, can you please add @joseignaciorm to this issue as he'll be taking the lead on this.
@jimbeepbeep @joseignaciorm
Apologies for the mix-up on how to upgrade the connector here!
You'll need to upgrade your connector to version 0.1.1 to pick up the bugfix. To upgrade your connector version, go to the admin panel on the left-hand side of the UI, find the Instagram connector in the list, and input the latest connector version.
Please let us know if you have any feedback or issues.
Enjoy!
Thanks @sherifnada, @yevhenii-ldv and team, I've created a new (Instagram) connector from 0.1.1 and I can confirm... it seems to work perfectly! I'll get to work transforming the data for our analytics and running some QA but it looks great from my initial inspection, and thanks for also including the JSON fields for the more weirdly structured metric responses. You guys have done a great job and (I believe) created the only way in the world of syncing data across multiple linked IG accounts in one single pipeline, with just one configuration, whilst also eliminating any process for new account onboarding. @michel-tricot and @johnlafleur your team rock!
Thanks @jimbeepbeep!
@jimbeepbeep our pleasure! So glad to hear it's useful :)
OK @sherifnada I've set this connection up on a 3 hourly sync and have had the chance to spend a bit more time with this data today. A lot looks right but there are a few issues in the data which we need to look at. As you know it can get a little confusing with the different id fields, but it's important to note that the id field in the users table corresponds to the business_account_id in all of the other tables (where available).
QA for distinct counts:
Please let me know if there's anything I can do or any further information you need to address these issues.
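For anyone wanting to repeat this kind of distinct-count check, a minimal sketch is shown below. It relies only on the id relationship described above (users.id matches business_account_id elsewhere); the rows are invented sample data, not results from the actual sync.

```python
def distinct_account_ids(rows, key):
    """Collect the distinct non-null values of `key` across rows."""
    return {row[key] for row in rows if row.get(key) is not None}


# Invented sample rows, for illustration only.
users = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
media = [
    {"id": "m1", "business_account_id": "a"},
    {"id": "m2", "business_account_id": "a"},
    {"id": "m3", "business_account_id": "b"},
]

user_ids = distinct_account_ids(users, "id")
media_account_ids = distinct_account_ids(media, "business_account_id")

# Accounts present in users but with no rows in media.
accounts_without_media = sorted(user_ids - media_account_ids)
print(len(user_ids), len(media_account_ids), accounts_without_media)
```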
@jimbeepbeep thanks for reporting the issue. Could you share the logs for the relevant jobs? We'll investigate on our side as well.
@jimbeepbeep I'm unable to reproduce a similar issue on my side. Let's hope logs can provide some insights!
@sherifnada I have re-run the sync and included the logs below. I also updated my previous comment, as stories are ephemeral and they do show up in subsequent syncs. However, the issue is still there with media and media_insights:
Logs below, looks like the streams might be bombing out after an OAuthException error, not sure if that's something we can fix on our end... let me know your thoughts.
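Two failure modes stand out in the log that follows: a `KeyError: 'media_url'` raised from `clear_video_url`, and a Graph API error (code 100, subcode 2108006) for media posted before the account became a business account. The sketch below shows one defensive way to handle both so that a single record does not abort the whole stream. It is not the connector's actual code: the trimming behaviour is assumed from the one line visible in the traceback, and `is_skippable` is a hypothetical helper.

```python
from typing import Optional

MARKER = "&_nc_rid="
# Subcode reported in the log for "Media posted before business account conversion".
PRE_CONVERSION_SUBCODE = 2108006


def clear_video_url(record: dict) -> dict:
    """Trim media_url at MARKER when present; leave the record unchanged otherwise."""
    url: Optional[str] = record.get("media_url")  # may be absent for some media
    if url is not None:
        end = url.find(MARKER)
        if end != -1:
            record = {**record, "media_url": url[:end]}
    return record


def is_skippable(error_body: dict) -> bool:
    """True for the per-media insights error that should skip one record only."""
    error = error_body.get("error", {})
    return (error.get("code") == 100
            and error.get("error_subcode") == PRE_CONVERSION_SUBCODE)
```

A caller could log and continue when `is_skippable` returns true, and re-raise everything else so genuine authentication problems still surface.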
2021-03-24 06:05:34 INFO (/tmp/workspace/58/0) WorkerRun(call):62 - Executing worker wrapper. Airbyte version: AIRBYTE_VERSION
2021-03-24 06:05:34 INFO (/tmp/workspace/58/0) TemporalAttemptExecution(get):79 - Executing worker wrapper. Airbyte version: AIRBYTE_VERSION
2021-03-24 06:05:34 INFO (/tmp/workspace/58/0) DefaultSyncWorker(run):86 - configured sync modes: {stories=full_refresh, user_lifetime_insights=full_refresh, media=full_refresh, story_insights=full_refresh, user_insights=incremental, users=full_refresh, media_insights=full_refresh}
2021-03-24 06:05:34 INFO (/tmp/workspace/58/0) DefaultAirbyteDestination(start):67 - Running target...
2021-03-24 06:05:34 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Checking if airbyte/destination-bigquery:0.2.0 exists...
2021-03-24 06:05:35 DEBUG (/tmp/workspace/58/0) DockerProcessBuilderFactory(create):104 - Preparing command: docker run --rm -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/58/0 --network host airbyte/destination-bigquery:0.2.0 write --config destination_config.json --catalog destination_catalog.json
2021-03-24 06:05:35 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - airbyte/destination-bigquery:0.2.0 was found locally.
2021-03-24 06:05:35 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Checking if airbyte/source-instagram:0.1.1 exists...
2021-03-24 06:05:35 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - airbyte/source-instagram:0.1.1 was found locally.
2021-03-24 06:05:35 DEBUG (/tmp/workspace/58/0) DockerProcessBuilderFactory(create):104 - Preparing command: docker run --rm -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/58/0 --network host airbyte/source-instagram:0.1.1 read --config source_config.json --catalog source_catalog.json --state input_state.json
2021-03-24 06:05:37 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:37 INFO i.a.i.d.b.BigQueryDestination(main):390 - {} - starting destination: class io.airbyte.integrations.destination.bigquery.BigQueryDestination
2021-03-24 06:05:37 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:37 INFO i.a.i.b.IntegrationRunner(run):78 - {} - Running integration: io.airbyte.integrations.destination.bigquery.BigQueryDestination
2021-03-24 06:05:37 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:37 INFO i.a.i.b.IntegrationCliParser(parseOptions):135 - {} - integration args: {catalog=destination_catalog.json, write=null, config=destination_config.json}
2021-03-24 06:05:37 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:37 INFO i.a.i.b.IntegrationRunner(run):82 - {} - Command: WRITE
2021-03-24 06:05:37 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:37 INFO i.a.i.b.IntegrationRunner(run):83 - {} - Integration config: IntegrationConfig{command=WRITE, configPath='destination_config.json', catalogPath='destination_catalog.json', statePath='null'}
2021-03-24 06:05:39 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:39 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:39 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:39 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:40 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:40 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:40 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:40 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:40 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:40 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:41 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:41 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:41 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:05:41 INFO i.a.i.d.b.BigQueryDestination(createTable):264 - {} - Table created successfully
2021-03-24 06:05:51 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Starting syncing SourceInstagram
2021-03-24 06:05:51 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing media stream
2021-03-24 06:06:16 ERROR (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):108 - Encountered an exception while reading stream SourceInstagram
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 88, in read
yield from self._read_stream(logger=logger, client=client, configured_stream=configured_stream, state=total_state)
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 106, in _read_stream
for record in client.read_stream(configured_stream.stream):
File "/usr/local/lib/python3.7/site-packages/base_python/client.py", line 166, in read_stream
for message in method(fields=fields):
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 261, in list
yield clear_video_url(record_data)
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 46, in clear_video_url
end_of_string = record_data["media_url"].find("&_nc_rid=")
KeyError: 'media_url'
2021-03-24 06:06:16 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing media_insights stream
2021-03-24 06:06:37 ERROR (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):108 - Insights error: {'error': {'message': 'Invalid parameter', 'type': 'OAuthException', 'code': 100, 'error_data': {'blame_field_specs': [['']]}, 'error_subcode': 2108006, 'is_transient': False, 'error_user_title': 'Media posted before business account conversion', 'error_user_msg': "The media was posted before the most recent time that the user's account was converted to a business account from a personal account.", 'fbtrace_id': 'AsEo4TGbJgj894OcT7XaCFL'}}
2021-03-24 06:06:37 ERROR (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):108 - Encountered an exception while reading stream SourceInstagram
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 88, in read
yield from self._read_stream(logger=logger, client=client, configured_stream=configured_stream, state=total_state)
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 106, in _read_stream
for record in client.read_stream(configured_stream.stream):
File "/usr/local/lib/python3.7/site-packages/base_python/client.py", line 166, in read_stream
for message in method(fields=fields):
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 315, in list
**{record.get("name"): record.get("values")[0]["value"] for record in self._get_insights(ig_media)},
File "/usr/local/lib/python3.7/site-packages/backoff/_sync.py", line 94, in retry
ret = target(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 337, in _get_insights
raise error
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 334, in _get_insights
return item.get_insights(params={"metric": metrics})
File "/usr/local/lib/python3.7/site-packages/facebook_business/adobjects/igmedia.py", line 249, in get_insights
return request.execute()
File "/usr/local/lib/python3.7/site-packages/facebook_business/api.py", line 677, in execute
cursor.load_next_page()
File "/usr/local/lib/python3.7/site-packages/facebook_business/api.py", line 844, in load_next_page
params=self.params,
File "/usr/local/lib/python3.7/site-packages/facebook_business/api.py", line 350, in call
raise fb_response.error()
facebook_business.exceptions.FacebookRequestError:
Message: Call was not successful
Method: GET
Path: https://graph.facebook.com/v10.0/17948370028410575/insights
Params: {'metric': '["carousel_album_engagement","carousel_album_impressions","carousel_album_reach","carousel_album_saved"]'}
Status: 400
Response:
{
"error": {
"message": "Invalid parameter",
"type": "OAuthException",
"code": 100,
"error_data": {
"blame_field_specs": [
[
""
]
]
},
"error_subcode": 2108006,
"is_transient": false,
"error_user_title": "Media posted before business account conversion",
"error_user_msg": "The media was posted before the most recent time that the user's account was converted to a business account from a personal account.",
"fbtrace_id": "AsEo4TGbJgj894OcT7XaCFL"
}
}
2021-03-24 06:06:37 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing stories stream
2021-03-24 06:06:48 ERROR (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):108 - Encountered an exception while reading stream SourceInstagram
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 88, in read
yield from self._read_stream(logger=logger, client=client, configured_stream=configured_stream, state=total_state)
File "/usr/local/lib/python3.7/site-packages/base_python/source.py", line 106, in _read_stream
for record in client.read_stream(configured_stream.stream):
File "/usr/local/lib/python3.7/site-packages/base_python/client.py", line 166, in read_stream
for message in method(fields=fields):
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 290, in list
yield clear_video_url(record_data)
File "/usr/local/lib/python3.7/site-packages/source_instagram/client/api.py", line 46, in clear_video_url
end_of_string = record_data["media_url"].find("&_nc_rid=")
KeyError: 'media_url'
2021-03-24 06:06:48 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing story_insights stream
2021-03-24 06:07:27 ERROR (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):108 - Insights error: (#10) Not enough viewers for the media to show insights
2021-03-24 06:08:43 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Set state of user_insights stream to {'17841400192320675': '2021-03-23T07:00:00+00:00', '17841400284745138': '2021-03-23T07:00:00+00:00', '17841400346850774': '2021-03-23T07:00:00+00:00', '17841400489293275': '2021-03-23T07:00:00+00:00', '17841401318740840': '2021-03-23T07:00:00+00:00', '17841401446234888': '2021-03-23T07:00:00+00:00', '17841401781548125': '2021-03-23T07:00:00+00:00', '17841401980959720': '2021-03-23T07:00:00+00:00', '17841402916966028': '2021-03-23T07:00:00+00:00', '17841402958308126': '2021-03-23T07:00:00+00:00', '17841403029079681': '2021-03-23T07:00:00+00:00', '17841403071217400': '2021-03-23T07:00:00+00:00', '17841403177990120': '2021-03-23T07:00:00+00:00', '17841403348750148': '2021-03-23T07:00:00+00:00', '17841403497459345': '2021-03-23T07:00:00+00:00', '17841403573287506': '2021-03-23T07:00:00+00:00', '17841403884008942': '2021-03-23T07:00:00+00:00', '17841404108797506': '2021-03-23T07:00:00+00:00', '17841404135689658': '2021-03-23T07:00:00+00:00', '17841404181558895': '2021-03-23T07:00:00+00:00', '17841404189378325': '2021-03-23T07:00:00+00:00', '17841404191176661': '2021-03-23T07:00:00+00:00', '17841404204706856': '2021-03-23T07:00:00+00:00', '17841404206518515': '2021-03-23T07:00:00+00:00', '17841404217634823': '2021-03-23T07:00:00+00:00', '17841404222658404': '2021-03-23T07:00:00+00:00', '17841404247169359': '2021-03-23T07:00:00+00:00', '17841404249556597': '2021-03-23T07:00:00+00:00', '17841404277273118': '2021-03-23T07:00:00+00:00', '17841404289348916': '2021-03-23T07:00:00+00:00', '17841404291224111': '2021-03-23T07:00:00+00:00', '17841404295498412': '2021-03-23T07:00:00+00:00', '17841404309648630': '2021-03-23T07:00:00+00:00', '17841404414595147': '2021-03-23T07:00:00+00:00', '17841404414625698': '2021-03-23T07:00:00+00:00', '17841404417265499': '2021-03-23T07:00:00+00:00', '17841404479139399': '2021-03-23T07:00:00+00:00', 
'17841404515155637': '2021-03-23T07:00:00+00:00', '17841404526875296': '2021-03-23T07:00:00+00:00', '17841404550164361': '2021-03-23T07:00:00+00:00', '17841405395017198': '2021-03-23T07:00:00+00:00', '17841406062416044': '2021-03-23T07:00:00+00:00', '17841407013699980': '2021-03-23T07:00:00+00:00', '17841407037879823': '2021-03-23T07:00:00+00:00', '17841407132139874': '2021-03-23T07:00:00+00:00', '17841407156466071': '2021-03-23T07:00:00+00:00', '17841407200701023': '2021-03-23T07:00:00+00:00', '17841407218221212': '2021-03-23T07:00:00+00:00', '17841407224931898': '2021-03-23T07:00:00+00:00', '17841407336201984': '2021-03-23T07:00:00+00:00', '17841407357301285': '2021-03-23T07:00:00+00:00', '17841407382381164': '2021-03-23T07:00:00+00:00', '17841407382430084': '2021-03-23T07:00:00+00:00', '17841407401991982': '2021-03-23T07:00:00+00:00', '17841407437152137': '2021-03-23T07:00:00+00:00', '17841407477879305': '2021-03-23T07:00:00+00:00', '17841407479889231': '2021-03-23T07:00:00+00:00', '17841407517216804': '2021-03-23T07:00:00+00:00', '17841407535576014': '2021-03-23T07:00:00+00:00', '17841407542049081': '2021-03-23T07:00:00+00:00', '17841407606405883': '2021-03-23T07:00:00+00:00', '17841407622209315': '2021-03-23T07:00:00+00:00', '17841407695930820': '2021-03-23T07:00:00+00:00', '17841407758391657': '2021-03-23T07:00:00+00:00', '17841407761931817': '2021-03-23T07:00:00+00:00', '17841407783327203': '2021-03-23T07:00:00+00:00', '17841407802011612': '2021-03-23T07:00:00+00:00', '17841407804681480': '2021-03-23T07:00:00+00:00', '17841407875481752': '2021-03-23T07:00:00+00:00', '17841408081871487': '2021-03-23T07:00:00+00:00', '17841408181124483': '2021-03-23T07:00:00+00:00', '17841408542071957': '2021-03-23T07:00:00+00:00', '17841413777302998': '2021-03-23T07:00:00+00:00', '17841413976319144': '2021-03-23T07:00:00+00:00', '17841426779704172': '2021-03-23T07:00:00+00:00', '17841430918712351': '2021-03-23T07:00:00+00:00', '17841431142927045': '2021-03-23T07:00:00+00:00', 
'17841431308458310': '2021-03-23T07:00:00+00:00', '17841431327763937': '2021-03-23T07:00:00+00:00', '17841431374667263': '2021-03-23T07:00:00+00:00', '17841431398606029': '2021-03-23T07:00:00+00:00'}
2021-03-24 06:08:43 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing user_insights stream
2021-03-24 06:08:44 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing user_lifetime_insights stream
2021-03-24 06:08:58 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Syncing users stream
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - /usr/local/lib/python3.7/site-packages/facebook_business/utils/api_utils.py:30: UserWarning: media does not allow field children
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - warnings.warn(message)
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - /usr/local/lib/python3.7/site-packages/facebook_business/utils/api_utils.py:30: UserWarning: value of metric might not be compatible. Expect list<metric_enum>; got <class 'list'>
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - warnings.warn(message)
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - /usr/local/lib/python3.7/site-packages/facebook_business/utils/api_utils.py:30: UserWarning: value of period might not be compatible. Expect list<period_enum>; got <class 'str'>
2021-03-24 06:09:07 ERROR (/tmp/workspace/58/0) LineGobbler(voidCall):69 - warnings.warn(message)
2021-03-24 06:09:07 INFO (/tmp/workspace/58/0) DefaultAirbyteStreamFactory(internalLog):110 - Finished syncing SourceInstagram
2021-03-24 06:09:07 DEBUG (/tmp/workspace/58/0) DefaultAirbyteSource(close):109 - Closing tap process
2021-03-24 06:09:07 DEBUG (/tmp/workspace/58/0) DefaultAirbyteDestination(close):105 - Closing target process
2021-03-24 06:09:07 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:09:07 INFO i.a.i.b.FailureTrackingConsumer(close):64 - {} - hasFailed: false.
2021-03-24 06:09:11 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:09:11 ERROR i.a.i.d.b.BigQueryDestination$RecordConsumer(close):344 - {} - executing on success close procedure.
2021-03-24 06:09:23 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:09:23 INFO i.a.i.b.IntegrationRunner(run):120 - {} - Completed integration: io.airbyte.integrations.destination.bigquery.BigQueryDestination
2021-03-24 06:09:23 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 2021-03-24 06:09:23 INFO i.a.i.d.b.BigQueryDestination(main):392 - {} - completed destination: class io.airbyte.integrations.destination.bigquery.BigQueryDestination
2021-03-24 06:09:23 INFO (/tmp/workspace/58/0) DefaultSyncWorker(run):113 - Running normalization.
2021-03-24 06:09:23 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Checking if airbyte/normalization:0.1.15 exists...
2021-03-24 06:09:23 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - airbyte/normalization:0.1.15 was found locally.
2021-03-24 06:09:23 DEBUG (/tmp/workspace/58/0) DockerProcessBuilderFactory(create):104 - Preparing command: docker run --rm -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/58/0/normalize --network host airbyte/normalization:0.1.15 run --integration-type bigquery --config destination_config.json --catalog destination_catalog.json
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Namespace(config='destination_config.json', integration_type=<DestinationType.bigquery: 'bigquery'>, out='/data/58/0/normalize')
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - transform_bigquery
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Processing destination_catalog.json...
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_ab1.sql from media
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_ab2.sql from media
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_ab3.sql from media
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/media.sql from media
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_insights_ab1.sql from media_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_insights_ab2.sql from media_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_insights_ab3.sql from media_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/media_insights.sql from media_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ab1.sql from stories
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ab2.sql from stories
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ab3.sql from stories
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/stories.sql from stories
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/story_insights_ab1.sql from story_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/story_insights_ab2.sql from story_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/story_insights_ab3.sql from story_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/story_insights.sql from story_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_insights_ab1.sql from user_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_insights_ab2.sql from user_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_insights_ab3.sql from user_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/user_insights.sql from user_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_lifetime_insights_ab1.sql from user_lifetime_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_lifetime_insights_ab2.sql from user_lifetime_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/user_lifetime_insights_ab3.sql from user_lifetime_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/user_lifetime_insights.sql from user_lifetime_insights
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/users_ab1.sql from users
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/users_ab2.sql from users
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/users_ab3.sql from users
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/users.sql from users
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_0f8_owner_ab1.sql from media/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_0f8_owner_ab2.sql from media/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_0f8_owner_ab3.sql from media/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/media_0f8_owner.sql from media/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_d39_children_ab1.sql from media/children
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_d39_children_ab2.sql from media/children
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_d39_children_ab3.sql from media/children
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/media_d39_children.sql from media/children
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ce2_owner_ab1.sql from stories/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ce2_owner_ab2.sql from stories/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/stories_ce2_owner_ab3.sql from stories/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/stories_ce2_owner.sql from stories/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Ignoring substream 'value' from user_lifetime_insights/value because properties list is empty
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_children_4f7_owner_ab1.sql from media/children/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_children_4f7_owner_ab2.sql from media/children/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_views/airbyte_tripscout_instagram/media_children_4f7_owner_ab3.sql from media/children/owner
2021-03-24 06:09:24 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Generating airbyte_tables/airbyte_tripscout_instagram/media_children_4f7_owner.sql from media/children/owner
2021-03-24 06:09:26 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Running with dbt=0.18.1
2021-03-24 06:09:29 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Found 44 models, 0 tests, 0 snapshots, 0 analyses, 341 macros, 0 operations, 0 seed files, 7 sources
2021-03-24 06:09:29 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 -
2021-03-24 06:09:29 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:29 | Concurrency: 32 threads (target='prod')
2021-03-24 06:09:29 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:29 |
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 1 of 11 START table model airbyte_tripscout_instagram.media_insights......................................... [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 2 of 11 START table model airbyte_tripscout_instagram.user_insights.......................................... [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 3 of 11 START table model airbyte_tripscout_instagram.user_lifetime_insights................................. [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 4 of 11 START table model airbyte_tripscout_instagram.story_insights......................................... [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 5 of 11 START table model airbyte_tripscout_instagram.stories................................................ [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 7 of 11 START table model airbyte_tripscout_instagram.users.................................................. [RUN]
2021-03-24 06:09:30 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:30 | 6 of 11 START table model airbyte_tripscout_instagram.media.................................................. [RUN]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 7 of 11 OK created table model airbyte_tripscout_instagram.users............................................. [CREATE TABLE (81.0 rows, 47.8 KB processed) in 1.87s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 1 of 11 OK created table model airbyte_tripscout_instagram.media_insights.................................... [CREATE TABLE (120.0 rows, 20.1 KB processed) in 2.28s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 2 of 11 OK created table model airbyte_tripscout_instagram.user_insights..................................... [CREATE TABLE (2.8k rows, 1.6 MB processed) in 2.13s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 4 of 11 OK created table model airbyte_tripscout_instagram.story_insights.................................... [CREATE TABLE (191.0 rows, 36.3 KB processed) in 2.10s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 3 of 11 OK created table model airbyte_tripscout_instagram.user_lifetime_insights............................ [CREATE TABLE (320.0 rows, 242.9 KB processed) in 2.13s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 6 of 11 OK created table model airbyte_tripscout_instagram.media............................................. [CREATE TABLE (356.0 rows, 400.4 KB processed) in 2.14s]
2021-03-24 06:09:32 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:32 | 8 of 11 START table model airbyte_tripscout_instagram.media_0f8_owner........................................ [RUN]
2021-03-24 06:09:33 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:33 | 9 of 11 START table model airbyte_tripscout_instagram.media_d39_children..................................... [RUN]
2021-03-24 06:09:33 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:33 | 5 of 11 OK created table model airbyte_tripscout_instagram.stories........................................... [CREATE TABLE (22.0 rows, 19.6 KB processed) in 2.75s]
2021-03-24 06:09:33 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:33 | 10 of 11 START table model airbyte_tripscout_instagram.stories_ce2_owner..................................... [RUN]
2021-03-24 06:09:34 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:34 | 8 of 11 OK created table model airbyte_tripscout_instagram.media_0f8_owner................................... [CREATE TABLE (356.0 rows, 24.3 KB processed) in 1.63s]
2021-03-24 06:09:34 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:34 | 10 of 11 OK created table model airbyte_tripscout_instagram.stories_ce2_owner................................ [CREATE TABLE (22.0 rows, 1.5 KB processed) in 1.27s]
2021-03-24 06:09:35 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:35 | 9 of 11 OK created table model airbyte_tripscout_instagram.media_d39_children................................ [CREATE TABLE (147.0 rows, 92.0 KB processed) in 2.19s]
2021-03-24 06:09:35 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:35 | 11 of 11 START table model airbyte_tripscout_instagram.media_children_4f7_owner.............................. [RUN]
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:36 | 11 of 11 OK created table model airbyte_tripscout_instagram.media_children_4f7_owner......................... [CREATE TABLE (147.0 rows, 10.0 KB processed) in 1.49s]
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:36 |
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - 06:09:36 | Finished running 11 table models in 7.55s.
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 -
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Completed successfully
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 -
2021-03-24 06:09:36 INFO (/tmp/workspace/58/0) LineGobbler(voidCall):69 - Done. PASS=11 WARN=0 ERROR=0 SKIP=0 TOTAL=11
2021-03-24 06:09:37 DEBUG (/tmp/workspace/58/0) DefaultNormalizationRunner(close):97 - Closing tap process
2021-03-24 06:09:37 INFO (/tmp/workspace/58/0) DefaultSyncWorker(run):130 - sync summary: io.airbyte.config.StandardSyncSummary@60ecdbb3[status=completed,recordsSynced=1090,bytesSynced=774574,startTime=1616565934923,endTime=1616566177262]
@jimbeepbeep we did some digging and found two issues:
@yevhenii-ldv is working on a fix for both issues and we'll have one for you soon!
Thanks team!
@jimbeepbeep
We just merged this bugfix into master and released a new version of the connector.
Please upgrade your Instagram connector to version 0.1.2 to pick up the bugfix. To upgrade your connector version, go to the admin panel on the left-hand side of the UI, find this connector in the list, and input the latest connector version.
Please let us know if you have any further questions.
Enjoy!
Thanks @yevhenii-ldv I've upgraded the connector and re-run the sync, it looks like it's fixed the media_insights issue, which is awesome. However there are still only 3/81 accounts coming through for media:
Logs attached, let me know if you need anything else:
@jimbeepbeep could you try manually triggering a sync? I'm seeing this error in the logs
Status: 500
Response:
{
"error": {
"message": "An unexpected error has occurred. Please retry your request later.",
"type": "OAuthException",
"is_transient": true,
"code": 2,
"fbtrace_id": "AJ3EQr2jeWlBOTQFNzQWhUk"
}
}
This was an internal error in the FB server that caused the connector to fail. Normally, the connector should retry the sync, but we discovered a related open bug yesterday: https://github.com/airbytehq/airbyte/issues/2616. Rerunning the sync should show you these media assets.
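For anyone reading along, here is a minimal sketch of how a client could tell a transient failure like the one above from a permanent one. It relies only on the `is_transient` field visible in that payload plus the HTTP status code. The helper name `is_transient_error` is hypothetical, and this is not necessarily how the connector handles it.

```python
import json


def is_transient_error(status_code: int, body: str) -> bool:
    """Return True when a response looks like a retryable server-side failure.

    Hypothetical helper: it only inspects `error.is_transient` (as seen in the
    payload above) and falls back to treating any 5xx status as retryable.
    """
    try:
        error = json.loads(body)["error"]
        flagged = bool(error.get("is_transient", False))
    except (ValueError, KeyError, TypeError, AttributeError):
        # Body is not JSON, or does not have the expected error object.
        flagged = False
    return flagged or status_code >= 500
```

With the response above, `is_transient_error(500, body)` would return `True`, so the request is a candidate for a retry rather than a hard failure.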
Thanks @sherifnada I have just re-run the sync and it seems to be picking up media from more accounts (7/82) but not all of them:
The sync now takes over 4 hours and most of the time seems to be spent handling these "Media posted before business account conversion" errors. Logs attached (access token redacted):
Let me know if you need anything else.
@jimbeepbeep we've identified this issue and will address in #2626 -- the issue is that the FB server has transient 500 failures that we should just back off on and retry. Will ping you here when that issue is addressed.
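The "back off and retry" idea can be sketched roughly as below. This is an illustration only, not the fix that will land for #2626: `TransientError` and `call_with_backoff` are hypothetical names, and the delays and attempt count are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical exception the request function raises on a retryable failure."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff while it raises TransientError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: surface the last failure to the caller.
                raise
            # Delay doubles on each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 50% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` argument is injectable only so the loop can be exercised without actually waiting.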
Thanks, as always @sherifnada!
@jimbeepbeep
We just merged the bugfix for Issue #2626 into master and released a new version of the connector.
Please upgrade your Instagram connector to version 0.1.3 to pick up the bugfix. To upgrade your connector version, go to the admin panel on the left-hand side of the UI, find this connector in the list, and input the latest connector version.
I hope this bugfix will fix the remaining problems with the Instagram Connector. Please let us know if you have any further questions.
Enjoy!
Great stuff @yevhenii-ldv and @sherifnada, I've re-run the sync and it completed fine... You guys are amazing, thanks!
Hey guys, not sure where the best place to raise issues is... let me know if I should use Slack instead of here. I'm just looking through historic logs and there are a few failures, plus the sync from last night (6:13PM UTC) is still running after 2 failed attempts, about 17 hours later:
Normally this completes as per yesterday morning: Succeeded 157.25 MB | 201,643 records | 6h 55m 16s | Sync
Logs below (access token redacted). I have checked the Facebook App dashboard and we're nowhere near app rate limits so it's definitely not that.
Attempt 1: logs-52-0.txt Attempt 2: logs-52-1.txt Attempt 3: logs-52-2.txt
I'm running airbyte/source-instagram 0.1.4 and Airbyte 0.19.0-alpha, I haven't updated in a while as this data is mission critical for a client and the process looks complicated enough that I'm worried I'll interrupt a sync or not be able to get it back up and running. Is there a simpler approach (or a script I can run) rather than this process in the docs?
@yevhenii-ldv / @sherifnada this connector is now failing, screenshot below. Let me know if you need any more logs in addition to the above...
Hello @jimbeepbeep. Thank you for your comment and for raising the issue. We'll investigate this error and your logs and find the cause of the problem.
Will ping you here when any new information on this issue becomes known.
@jimbeepbeep we've created an issue to track this here: https://github.com/airbytehq/airbyte/issues/3241
FWIW it's easier to create new issues for the instagram connector going forward
Awesome thanks
Tell us about the new integration you’d like to have
Which source and which destination? Which frequency?
Instagram insights to Google BigQuery, every <10 minutes(!). It's actually from the Facebook Graph API so it might be possible to reuse some aspects of the Facebook Marketing Source Connector.
Describe the context around this new integration
Which team in your company wants this integration, what for? This helps us understand the use case.
I have already built this as a custom extractor for a client using Google Cloud Functions - the reason is that:
https://graph.facebook.com/{graph-api-version}/{ig-user-id}/media?fields={fields}&access_token={access-token}), some story-related ones from the story endpoint (https://graph.facebook.com/{graph-api-version}/{ig-user-id}/stories?fields={fields}&access_token={access-token}) and more detailed insights from the insights endpoint (https://graph.facebook.com/v9.0/{ig-media-id}/insights?metric={metrics}&access_token={access-token}). Additionally account insights are returned from the user insights endpoint (https://graph.facebook.com/v9.0/{ig-user-id}/insights?metric={metric}&period={period}&since={since}&until={until}&access_token={access-token})
Describe the alternative you are considering or using
What are you considering doing if you don’t have this integration through Airbyte?
It was not viable to do this via e.g. Rivery as the sheer number of data transfers would have been prohibitively expensive, and the multi-account to single business account makes it incompatible with other tools. It feels like a great use-case for us to try out Airbyte, and I would be happy to share the (Python) code as deployed.
I am already running it via a set of Cloud Functions (orchestrated via PubSub from another Cloud Function so each invocation only queries data from a single account), however there is an ongoing cost associated with this and I would rather move it to a dedicated platform and monitor/manage it alongside other flows instead of as custom code.
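To illustrate the endpoints listed in the context section, here is a rough Python sketch of how the request URLs could be assembled per account. It is not the deployed code: `graph_url` is a hypothetical helper, and the IDs, field lists, and token are placeholders.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def graph_url(version, node_id, edge, access_token, **params):
    """Build a Graph API URL of the form {base}/{version}/{node_id}/{edge}?{query}.

    Hypothetical helper: extra keyword arguments become query parameters and
    the access token is appended last, matching the URLs shown above.
    """
    # safe="," keeps comma-separated field lists readable instead of %2C.
    query = urlencode({**params, "access_token": access_token}, safe=",")
    return f"{GRAPH_BASE}/{version}/{node_id}/{edge}?{query}"


# Placeholder values for illustration only.
media_url = graph_url("v9.0", "IG_USER_ID", "media", "TOKEN", fields="id,timestamp")
stories_url = graph_url("v9.0", "IG_USER_ID", "stories", "TOKEN", fields="id,timestamp")
```

Keeping the account ID as an argument fits the one-account-per-invocation setup described above, since each invocation can build its own URLs from a single ID.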