airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other

[source-shopify] product_images stream fails #39478

Closed SebastienCY closed 4 months ago

SebastienCY commented 4 months ago

Connector Name

source-shopify

Connector Version

2.2.3

What step the error happened?

During the sync

Relevant information

The product_images stream sync is failing with an error on the source. We encountered this issue while syncing the data of 2 shops (out of ~15), following the upgrade to v2.2.3.

Reproducing the bulk request manually (using this query template) shows the following in the downloaded JSONL file:

{"__typename":"Product","id":"gid:\/\/shopify\/Product\/9062091161885"}
{"__typename":"MediaImage","createdAt":"2024-06-12T23:41:27Z","updatedAt":"2024-06-12T23:41:28Z","image":null,"__parentId":"gid:\/\/shopify\/Product\/9062091161885"}

{"__typename":"Product","id":"gid:\/\/shopify\/Product\/9062093586717"}
{"__typename":"MediaImage","createdAt":"2024-06-12T23:21:17Z","updatedAt":"2024-06-12T23:21:19Z","image":{"url":"https:\/\/cdn.shopify.com\/xxxx"},"__parentId":"gid:\/\/shopify\/Product\/9062093586717"}
{"__typename":"Image","id":"gid:\/\/shopify\/ProductImage\/46359370301725","height":2000,"alt":"","src":"https:\/\/cdn.shopify.com\/xxxx","url":"https:\/\/cdn.shopify.com\/xxxx","width":2000,"__parentId":"gid:\/\/shopify\/Product\/9062093586717"}

Note the "image":null in the first MediaImage node. This is likely the direct cause of the exception raised.

A possible fix would be for the _merge_with_media method to ignore MediaImage nodes that have no image url. Wdyt?
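To check whether a downloaded bulk result contains such nodes, something like the following could be used. This is only a rough sketch: the function name `find_null_image_parents` is made up for illustration, and the sample is the first part of the data shown above.

```python
import json

# Sample taken from the JSONL excerpt above (raw string keeps the JSON "\/" escapes).
SAMPLE_JSONL = r'''
{"__typename":"Product","id":"gid:\/\/shopify\/Product\/9062091161885"}
{"__typename":"MediaImage","createdAt":"2024-06-12T23:41:27Z","updatedAt":"2024-06-12T23:41:28Z","image":null,"__parentId":"gid:\/\/shopify\/Product\/9062091161885"}
{"__typename":"Product","id":"gid:\/\/shopify\/Product\/9062093586717"}
{"__typename":"MediaImage","createdAt":"2024-06-12T23:21:17Z","updatedAt":"2024-06-12T23:21:19Z","image":{"url":"https:\/\/cdn.shopify.com\/xxxx"},"__parentId":"gid:\/\/shopify\/Product\/9062093586717"}
'''


def find_null_image_parents(lines):
    """Return the __parentId of every MediaImage node whose image is null or missing.

    Hypothetical helper for inspecting a bulk result; not part of the connector.
    """
    parents = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        node = json.loads(line)
        if node.get("__typename") == "MediaImage" and node.get("image") is None:
            parents.append(node.get("__parentId"))
    return parents


print(find_null_image_parents(SAMPLE_JSONL.splitlines()))
```

On the sample above, only the first product is reported.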
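To illustrate the idea, here is a minimal sketch of null-safe URL extraction. The only connector code I can see is the line in the traceback below, `item.get("image", {}).get("url")`, so the function names `extract_image_url` and `media_with_urls` are hypothetical and this is not the actual _merge_with_media implementation. The point is that the `{}` default of `dict.get` only applies when the key is absent, not when it is present with a null value.

```python
from typing import Any, Dict, Iterable, List, Optional


def extract_image_url(item: Dict[str, Any]) -> Optional[str]:
    """Return the image url of a node, or None if there is no image.

    item.get("image", {}) returns None when the key exists with a null value,
    so the default is never used; `or {}` covers both the missing and null cases.
    """
    return (item.get("image") or {}).get("url")


def media_with_urls(components: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only MediaImage nodes that actually carry an image url (hypothetical helper)."""
    return [
        c
        for c in components
        if c.get("__typename") == "MediaImage" and extract_image_url(c)
    ]


nodes = [
    {"__typename": "MediaImage", "image": None, "__parentId": "p1"},
    {"__typename": "MediaImage", "image": {"url": "https://cdn.shopify.com/xxxx"}, "__parentId": "p2"},
]
print([n["__parentId"] for n in media_with_urls(nodes)])
```

With this kind of guard, the node with "image":null is skipped instead of raising AttributeError.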

Relevant log output

2024-06-13 10:34:24 source > Stream: `product_images` requesting BULK Job for period: 2024-06-09T14:24:00+00:00 -- 2024-06-13T10:33:10.633973+00:00. Slice size: `P3.8D`
2024-06-13 10:34:25 source > Stream: `product_images`, the BULK Job: `gid://shopify/BulkOperation/4519080821021` is CREATED
2024-06-13 10:34:25 source > API Load: `REGULAR`
2024-06-13 10:34:32 source > Stream: `product_images`, the BULK Job: `gid://shopify/BulkOperation/4519080821021` is COMPLETED
2024-06-13 10:34:33 source > Stream: `product_images`, the BULK Job: `gid://shopify/BulkOperation/4519080821021` time elapsed: 7.306 sec.
2024-06-13 10:34:33 source > Encountered an exception while reading stream product_images
Traceback (most recent call last):
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 136, in read_file
    yield from self.produce_records(filename)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 130, in produce_records
    for record in self.process_line(jsonl_file):
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 95, in process_line
    yield from self.record_compose(loads(line))
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 81, in record_compose
    yield from self.buffer_flush()
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 69, in buffer_flush
    yield from self.record_process_components(record)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/query.py", line 1875, in record_process_components
    record["images"] = self._merge_with_media(record_components)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/query.py", line 1843, in _merge_with_media
    image_url = item.get("image", {}).get("url")
AttributeError: 'NoneType' object has no attribute 'get'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 135, in read
    yield from self._read_stream(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 230, in _read_stream
    for record_data_or_message in record_iterator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/core.py", line 169, in read
    for record_data_or_message in records:
  File "/airbyte/integration_code/source_shopify/streams/base_streams.py", line 770, in read_records
    yield from self.filter_records_newer_than_state(stream_state, records)
  File "/airbyte/integration_code/source_shopify/streams/base_streams.py", line 243, in filter_records_newer_than_state
    for index, record in enumerate(records_slice, 1):
  File "/airbyte/integration_code/source_shopify/streams/base_streams.py", line 675, in add_shop_url_field
    for record in records:
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 138, in read_file
    raise ShopifyBulkExceptions.BulkRecordProduceError(
source_shopify.shopify_graphql.bulk.exceptions.ShopifyBulkExceptions.BulkRecordProduceError: An error occured while producing records from BULK Job result. Trace: AttributeError("'NoneType' object has no attribute 'get'").
2024-06-13 10:34:33 source > Marking stream product_images as STOPPED


bazarnov commented 4 months ago

@Srooney3 Shopify limits BULK operations to 1 Bulk Job at a time. Therefore, the issue described here:

We encountered this issue syncing the data of 2 shops (out of ~15) consecutively to v2.2.3 upgrade

will continue to occur if there is a need to fetch the same data from the same shop at the same time.

The product_images issue has been fixed here: https://github.com/airbytehq/airbyte/pull/37767 Please update source-shopify to the latest available version to avoid such errors.

We may close this issue, as it is already fixed here: https://github.com/airbytehq/airbyte/pull/37767

bazarnov commented 4 months ago

@Srooney3 Please don't create public issues for internal usage; we have another repo for such cases: https://github.com/airbytehq/airbyte-internal-issues

SebastienCY commented 4 months ago

Hello @bazarnov We are currently running on Airbyte Cloud with the v2.4.6 Shopify connector, and we just hit the same issue again. Please consider the data sample provided in the issue description. The exception is a direct consequence of the connector not handling a null value for MediaImage.image where it expects a dict. Hope this helps solve the issue.

2024-07-01 01:04:50 platform > Replication output for workload 502e93c7-4518-4b0d-b753-8cc0052fa0c5_14029216_0_sync : io.airbyte.config.ReplicationOutput@14523509[replicationAttemptSummary=io.airbyte.config.ReplicationAttemptSummary@3fac2613[status=failed,recordsSynced=1188,bytesSynced=932751,startTime=1719795745601,endTime=1719795865734,totalStats=io.airbyte.config.SyncStats@657afc48[bytesCommitted=932751,bytesEmitted=1012205,destinationStateMessagesEmitted=10,destinationWriteEndTime=0,destinationWriteStartTime=1719795745790,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=13,maxSecondsBeforeSourceStateMessageEmitted=4,maxSecondsBetweenStateMessageEmittedandCommitted=94,meanSecondsBetweenStateMessageEmittedandCommitted=54,recordsEmitted=1371,recordsCommitted=1188,replicationEndTime=1719795864271,replicationStartTime=1719795745601,sourceReadEndTime=0,sourceReadStartTime=1719795745798,sourceStateMessagesEmitted=10,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],streamStats=[io.airbyte.config.StreamSyncStats@21f1c598[streamName=product_images,streamNamespace=<null>,stats=io.airbyte.config.SyncStats@240e56d2[bytesCommitted=108508,bytesEmitted=187962,destinationStateMessagesEmitted=<null>,destinationWriteEndTime=<null>,destinationWriteStartTime=<null>,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBetweenStateMessageEmittedandCommitted=<null>,meanSecondsBetweenStateMessageEmittedandCommitted=<null>,recordsEmitted=433,recordsCommitted=250,replicationEndTime=<null>,replicationStartTime=<null>,sourceReadEndTime=<null>,sourceReadStartTime=<null>,sourceStateMessagesEmitted=<null>,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],wasBackfilled=<null>,wasResumed=<null>,additionalProperties={}], 
io.airbyte.config.StreamSyncStats@7a4c60fe[streamName=products,streamNamespace=<null>,stats=io.airbyte.config.SyncStats@48664b43[bytesCommitted=467664,bytesEmitted=467664,destinationStateMessagesEmitted=<null>,destinationWriteEndTime=<null>,destinationWriteStartTime=<null>,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBetweenStateMessageEmittedandCommitted=<null>,meanSecondsBetweenStateMessageEmittedandCommitted=<null>,recordsEmitted=391,recordsCommitted=391,replicationEndTime=<null>,replicationStartTime=<null>,sourceReadEndTime=<null>,sourceReadStartTime=<null>,sourceStateMessagesEmitted=<null>,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],wasBackfilled=<null>,wasResumed=<null>,additionalProperties={}], io.airbyte.config.StreamSyncStats@67948ad2[streamName=custom_collections,streamNamespace=<null>,stats=io.airbyte.config.SyncStats@31fae834[bytesCommitted=41269,bytesEmitted=41269,destinationStateMessagesEmitted=<null>,destinationWriteEndTime=<null>,destinationWriteStartTime=<null>,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBetweenStateMessageEmittedandCommitted=<null>,meanSecondsBetweenStateMessageEmittedandCommitted=<null>,recordsEmitted=129,recordsCommitted=129,replicationEndTime=<null>,replicationStartTime=<null>,sourceReadEndTime=<null>,sourceReadStartTime=<null>,sourceStateMessagesEmitted=<null>,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],wasBackfilled=<null>,wasResumed=<null>,additionalProperties={}], 
io.airbyte.config.StreamSyncStats@28a83565[streamName=metafield_products,streamNamespace=<null>,stats=io.airbyte.config.SyncStats@526e5f60[bytesCommitted=1660,bytesEmitted=1660,destinationStateMessagesEmitted=<null>,destinationWriteEndTime=<null>,destinationWriteStartTime=<null>,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBetweenStateMessageEmittedandCommitted=<null>,meanSecondsBetweenStateMessageEmittedandCommitted=<null>,recordsEmitted=4,recordsCommitted=4,replicationEndTime=<null>,replicationStartTime=<null>,sourceReadEndTime=<null>,sourceReadStartTime=<null>,sourceStateMessagesEmitted=<null>,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],wasBackfilled=<null>,wasResumed=<null>,additionalProperties={}], io.airbyte.config.StreamSyncStats@11ef9de5[streamName=product_variants,streamNamespace=<null>,stats=io.airbyte.config.SyncStats@77b27ca4[bytesCommitted=313650,bytesEmitted=313650,destinationStateMessagesEmitted=<null>,destinationWriteEndTime=<null>,destinationWriteStartTime=<null>,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBeforeSourceStateMessageEmitted=<null>,maxSecondsBetweenStateMessageEmittedandCommitted=<null>,meanSecondsBetweenStateMessageEmittedandCommitted=<null>,recordsEmitted=414,recordsCommitted=414,replicationEndTime=<null>,replicationStartTime=<null>,sourceReadEndTime=<null>,sourceReadStartTime=<null>,sourceStateMessagesEmitted=<null>,discoverSchemaEndTime=<null>,discoverSchemaStartTime=<null>,additionalProperties={}],wasBackfilled=<null>,wasResumed=<null>,additionalProperties={}]],performanceMetrics=io.airbyte.config.PerformanceMetrics@1ed86c2a[additionalProperties={processFromSource={elapsedTimeInNanos=1337292474, executionCount=1398, avgExecTimeInNanos=956575.4463519313}, readFromSource={elapsedTimeInNanos=79950933220, 
executionCount=144742, avgExecTimeInNanos=552368.5814760056}, processFromDest={elapsedTimeInNanos=673215408, executionCount=11, avgExecTimeInNanos=6.120140072727273E7}, writeToDest={elapsedTimeInNanos=463572286, executionCount=1381, avgExecTimeInNanos=335678.7009413469}, readFromDest={elapsedTimeInNanos=108050963725, executionCount=225085, avgExecTimeInNanos=480045.1550525357}}],additionalProperties={}],state=<null>,outputCatalog=io.airbyte.protocol.models.ConfiguredAirbyteCatalog@3eeda5ce[streams=[io.airbyte.protocol.models.ConfiguredAirbyteStream@53000743[stream=io.airbyte.protocol.models.AirbyteStream@37b306db[name=airbyte_shopify_biofloral_france_myshopify_com_custom_collections,jsonSchema={"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the custom collection."},"image":{"type":["null","object"],"properties":{"alt":{"type":["null","string"],"description":"The alternative text description of the image."},"src":{"type":["null","string"],"description":"The URL of the image."},"width":{"type":["null","integer"],"description":"The width of the image in pixels."},"height":{"type":["null","integer"],"description":"The height of the image in pixels."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the image was created."}},"description":"Represents the image associated with the custom collection if available."},"title":{"type":["null","string"],"description":"The title of the custom collection."},"handle":{"type":["null","string"],"description":"The unique URL-friendly string that identifies the custom collection."},"shop_url":{"type":["null","string"],"description":"The URL of the shop where the custom collection belongs."},"body_html":{"type":["null","string"],"description":"The full description of the custom collection for display purposes."},"deleted_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the custom 
collection was deleted."},"sort_order":{"type":["null","string"],"description":"The order in which the custom collection should be displayed."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the custom collection was last updated."},"published_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the custom collection was published."},"deleted_message":{"type":["null","string"],"description":"Any additional message related to the deletion of the custom collection."},"published_scope":{"type":["null","string"],"description":"The scope where the custom collection is published (global or web)."},"template_suffix":{"type":["null","string"],"description":"The template suffix for the custom collection's URL."},"deleted_description":{"type":["null","string"],"description":"The description of why the custom collection was deleted."},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier of the custom collection accessible via GraphQL Admin API."}},"additionalProperties":true,"$schema":"http://json-schema.org/draft-07/schema#"},supportedSyncModes=[full_refresh, incremental],sourceDefinedCursor=true,defaultCursorField=[updated_at],sourceDefinedPrimaryKey=[[id]],namespace=inula_raw,isResumable=<null>,additionalProperties={}],syncMode=incremental,cursorField=[updated_at],destinationSyncMode=append_dedup,primaryKey=[[id]],generationId=1,minimumGenerationId=0,syncId=14029216,additionalProperties={}], io.airbyte.protocol.models.ConfiguredAirbyteStream@6ad7953b[stream=io.airbyte.protocol.models.AirbyteStream@39c78e7a[name=airbyte_shopify_biofloral_france_myshopify_com_metafield_products,jsonSchema={"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"A unique identifier for the metafield."},"key":{"type":["null","string"],"description":"The key or name that identifies the 
metafield."},"type":{"type":["null","string"],"description":"The type of the metafield value, such as 'string', 'integer', 'json_string', etc."},"value":{"type":["null","string"],"description":"The actual value of the metafield based on its type."},"owner_id":{"type":["null","integer"],"description":"The unique identifier of the resource that owns the metafield."},"shop_url":{"type":["null","string"],"description":"The shop URL where the metafield is associated with."},"namespace":{"type":["null","string"],"description":"The namespace for the metafield, helping to group related metafields together."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time the metafield was created in ISO 8601 format."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time the metafield was last updated in ISO 8601 format."},"value_type":{"type":["null","string"],"description":"A representation of the type of the value (for example, 'string' or 'integer')."},"description":{"type":["null","string"],"description":"The description of the metafield, providing additional information."},"owner_resource":{"type":["null","string"],"description":"The type of resource that owns the metafield, such as 'product' or 'collection'."},"admin_graphql_api_id":{"type":["null","string"],"description":"A unique identifier for the metafield used in the Shopify Admin GraphQL API."}},"additionalProperties":true,"$schema":"http://json-schema.org/draft-07/schema#"},supportedSyncModes=[full_refresh, incremental],sourceDefinedCursor=true,defaultCursorField=[updated_at],sourceDefinedPrimaryKey=[[id]],namespace=inula_raw,isResumable=<null>,additionalProperties={}],syncMode=incremental,cursorField=[updated_at],destinationSyncMode=append_dedup,primaryKey=[[id]],generationId=1,minimumGenerationId=0,syncId=14029216,additionalProperties={}], 
io.airbyte.protocol.models.ConfiguredAirbyteStream@6aaebc1e[stream=io.airbyte.protocol.models.AirbyteStream@4504fa4e[name=airbyte_shopify_biofloral_france_myshopify_com_product_images,jsonSchema={"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"Unique identifier for the image"},"alt":{"type":["null","string"],"description":"Alternative text description of the image for accessibility"},"src":{"type":["null","string"],"description":"URL of the image"},"width":{"type":["null","integer"],"description":"Width of the image in pixels"},"height":{"type":["null","integer"],"description":"Height of the image in pixels"},"position":{"type":["null","integer"],"description":"Position order of the image relative to other images of the same product"},"shop_url":{"type":["null","string"],"description":"URL of the shop where the image is hosted"},"created_at":{"type":["null","string"],"format":"date-time","description":"Date and time when the image was created"},"product_id":{"type":["null","integer"],"description":"Unique identifier of the product associated with the image"},"updated_at":{"type":["null","string"],"format":"date-time","description":"Date and time when the image was last updated"},"variant_ids":{"type":["null","array"],"items":{"type":["null","integer"]},"description":"Array of unique identifiers for the product variants associated with the image"},"admin_graphql_api_id":{"type":["null","string"],"description":"Unique identifier for the image in the Admin GraphQL API"}},"additionalProperties":true,"$schema":"http://json-schema.org/draft-07/schema#"},supportedSyncModes=[full_refresh, incremental],sourceDefinedCursor=true,defaultCursorField=[updated_at],sourceDefinedPrimaryKey=[[id]],namespace=inula_raw,isResumable=<null>,additionalProperties={}],syncMode=incremental,cursorField=[updated_at],destinationSyncMode=append_dedup,primaryKey=[[id]],generationId=1,minimumGenerationId=0,syncId=14029216,additionalProperties={}], 
io.airbyte.protocol.models.ConfiguredAirbyteStream@13acd198[stream=io.airbyte.protocol.models.AirbyteStream@169d39c9[name=airbyte_shopify_biofloral_france_myshopify_com_products,jsonSchema={"type":["object","null"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the product."},"tags":{"type":["null","string"],"description":"Tags associated with the product."},"image":{"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the image."},"alt":{"type":["null","string"],"description":"The alternative text for the image."},"src":{"type":["null","string"],"description":"The URL of the image source."},"width":{"type":["null","integer"],"description":"The width of the image."},"height":{"type":["null","integer"],"description":"The height of the image."},"position":{"type":["null","integer"],"description":"The position of the image."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the image was created."},"product_id":{"type":["null","integer"],"description":"The unique identifier of the product associated with the image."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the image was last updated."},"variant_ids":{"type":["null","array"],"items":{"type":["null","integer"],"description":"List of variant IDs associated with the image."},"description":"Array of variant IDs associated with this image."},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier of the image in the Admin GraphQL API."}},"description":"Represents the main product image linked to one or more variants."},"title":{"type":["null","string"],"description":"The title of the product."},"handle":{"type":["null","string"],"description":"The human-readable URL for the 
product."},"images":{"type":["null","array"],"items":{"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the image."},"alt":{"type":["null","string"],"description":"The alternative text for the image."},"src":{"type":["null","string"],"description":"The URL of the image source."},"width":{"type":["null","integer"],"description":"The width of the image."},"height":{"type":["null","integer"],"description":"The height of the image."},"position":{"type":["null","integer"],"description":"The position of the image."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the image was created."},"product_id":{"type":["null","integer"],"description":"The unique identifier of the product associated with the image."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the image was last updated."},"variant_ids":{"type":["null","array"],"items":{"type":["null","integer"],"description":"List of variant IDs associated with the image."},"description":"Array of variant IDs associated with each image."},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier of the image in the Admin GraphQL API."}}},"description":"Represents a collection of additional images related to the product."},"status":{"type":["null","string"],"description":"The status of the product."},"vendor":{"type":["null","string"],"description":"The vendor or manufacturer of the product."},"options":{"type":["null","array"],"items":{"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the product option."},"name":{"type":["null","string"],"description":"The name of the product option."},"values":{"type":["null","array"],"items":{"type":["null","string"],"description":"List of values associated with the product option."},"description":"Possible values that can be selected for each 
option."},"position":{"type":["null","integer"],"description":"The position of the product option."},"product_id":{"type":["null","integer"],"description":"The unique identifier of the product."}}},"description":"Represents different customizable options available for the product."},"shop_url":{"type":["null","string"],"description":"The URL of the shop where the product is listed."},"variants":{"type":["null","array"],"items":{"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier of the variant."},"sku":{"type":["null","string"],"description":"The stock keeping unit (SKU) of the variant."},"grams":{"type":["null","integer"],"description":"The weight of the variant in grams."},"price":{"type":["null","number"],"description":"The price of the variant."},"title":{"type":["null","string"],"description":"The title of the variant."},"weight":{"type":["null","number"],"description":"The weight of the variant."},"barcode":{"type":["null","string"],"description":"The barcode of the variant."},"option1":{"type":["null","string"],"description":"The value of option 1 for the variant."},"option2":{"type":["null","string"],"description":"The value of option 2 for the variant."},"option3":{"type":["null","string"],"description":"The value of option 3 for the variant."},"taxable":{"type":["null","boolean"],"description":"Indicates if the variant is taxable."},"image_id":{"type":["null","integer"],"description":"The unique identifier of the image associated with the variant."},"position":{"type":["null","integer"],"description":"The position of the variant."},"tax_code":{"type":["null","string"],"description":"The tax code for the variant."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the variant was created."},"product_id":{"type":["null","integer"],"description":"The unique identifier of the product associated with the 
variant."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the variant was last updated."},"weight_unit":{"type":["null","string"],"description":"The unit of weight for the variant."},"compare_at_price":{"type":["null","number"],"description":"The original price of the product before any discounts."},"inventory_policy":{"type":["null","string"],"description":"The inventory policy for the variant."},"inventory_item_id":{"type":["null","integer"],"description":"The unique identifier of the inventory item associated with the variant."},"requires_shipping":{"type":["null","boolean"],"description":"Indicates if the variant requires shipping."},"inventory_quantity":{"type":["null","integer"],"description":"The available quantity of the variant."},"presentment_prices":{"type":["null","array"],"items":{"type":["null","object"],"properties":{"price":{"type":["null","object"],"properties":{"amount":{"type":["null","number"],"description":"The price amount."},"currency_code":{"type":["null","string"],"description":"The currency code of the price."}},"description":"The price of the product variant."},"compare_at_price":{"type":["null","number"],"description":"The compare at price in different currencies."}}},"description":"Prices displayed to customers in different currencies or formats."},"fulfillment_service":{"type":["null","string"],"description":"The fulfillment service for the variant."},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier of the variant in the Admin GraphQL API."},"inventory_management":{"type":["null","string"],"description":"The management method for the variant inventory."},"old_inventory_quantity":{"type":["null","integer"],"description":"The previous quantity of the variant before change."}}},"description":"Represents different versions or variations of the product."},"body_html":{"type":["null","string"],"description":"The HTML description of the 
product."},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the product was created."},"deleted_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the product was deleted."},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the product was last updated."},"description":{"type":["null","string"],"description":"The product's description."},"media_count":{"type":["null","integer"],"description":"The total count of media (images/videos) associated with the product."},"is_gift_card":{"type":["null","boolean"],"description":"Indicates whether the product is a gift card."},"product_type":{"type":["null","string"],"description":"The type or category of the product."},"published_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the product was published."},"total_variants":{"type":["null","integer"],"description":"The total number of variants available for the product."},"deleted_message":{"type":["null","string"],"description":"Message related to the deletion of the product."},"published_scope":{"type":["null","string"],"description":"The scope of where the product is available for purchase."},"template_suffix":{"type":["null","string"],"description":"The template suffix used for the product."},"total_inventory":{"type":["null","integer"],"description":"The total inventory count of the product."},"description_html":{"type":["null","string"],"description":"The product's description in HTML format."},"online_store_url":{"type":["null","string"],"description":"The URL of the product on the online store."},"tracks_inventory":{"type":["null","boolean"],"description":"Indicates whether inventory tracking is enabled for the product."},"legacy_resource_id":{"type":["null","string"],"description":"The legacy resource ID of the product."},"deleted_description":{"type":["null","string"],"description":"Description 
of the reason for deletion."},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier of the product in the Admin GraphQL API."},"online_store_preview_url":{"type":["null","string"],"description":"The URL for previewing the product on the online store."}},"additionalProperties":true,"$schema":"http://json-schema.org/draft-07/schema#"},supportedSyncModes=[full_refresh, incremental],sourceDefinedCursor=true,defaultCursorField=[updated_at],sourceDefinedPrimaryKey=[[id]],namespace=inula_raw,isResumable=<null>,additionalProperties={}],syncMode=incremental,cursorField=[updated_at],destinationSyncMode=append_dedup,primaryKey=[[id]],generationId=1,minimumGenerationId=0,syncId=14029216,additionalProperties={}], io.airbyte.protocol.models.ConfiguredAirbyteStream@94c2948[stream=io.airbyte.protocol.models.AirbyteStream@6b7874a5[name=airbyte_shopify_biofloral_france_myshopify_com_product_variants,jsonSchema={"type":["null","object"],"properties":{"id":{"type":["null","integer"],"description":"The unique identifier for the variant"},"sku":{"type":["null","string"],"description":"The unique SKU (stock keeping unit) of the variant"},"grams":{"type":["null","integer"],"description":"The weight of the variant in grams"},"price":{"type":["null","number"],"description":"The price of the variant"},"title":{"type":["null","string"],"description":"The title of the variant"},"weight":{"type":["null","number"],"description":"The weight of the variant"},"barcode":{"type":["null","string"],"description":"The barcode associated with the variant"},"option1":{"type":["null","string"],"description":"The value for option 1 of the variant"},"option2":{"type":["null","string"],"description":"The value for option 2 of the variant"},"option3":{"type":["null","string"],"description":"The value for option 3 of the variant"},"taxable":{"type":["null","boolean"],"description":"Indicates whether taxes are applied to the 
variant"},"image_id":{"type":["null","integer"],"description":"The unique identifier for the image associated with the variant"},"position":{"type":["null","integer"],"description":"The position of the variant in the product's list of variants"},"shop_url":{"type":["null","string"],"description":"The URL of the shop where the variant is listed"},"created_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the variant was created"},"product_id":{"type":["null","integer"],"description":"The unique identifier for the product associated with the variant"},"updated_at":{"type":["null","string"],"format":"date-time","description":"The date and time when the variant was last updated"},"weight_unit":{"type":["null","string"],"description":"The unit of measurement for the weight of the variant"},"compare_at_price":{"type":["null","string"],"description":"The original price of the variant before any discount"},"inventory_policy":{"type":["null","string"],"description":"The inventory policy for the variant"},"inventory_item_id":{"type":["null","integer"],"description":"The unique identifier for the inventory item associated with the variant"},"requires_shipping":{"type":["null","boolean"],"description":"Indicates whether the variant requires shipping"},"inventory_quantity":{"type":["null","integer"],"description":"The current inventory quantity for the variant"},"presentment_prices":{"type":["null","array"],"items":{"type":["null","object"],"properties":{"price":{"type":["null","object"],"properties":{"amount":{"type":["null","number"],"description":"The amount of the price"},"currency_code":{"type":["null","string"],"description":"The currency code of the price"}},"description":"The price of the variant in a different currency"},"compare_at_price":{"type":["null","object"],"properties":{"amount":{"type":["null","number"],"description":"The amount of the price"},"currency_code":{"type":["null","string"],"description":"The currency code of 
the price"}}}}},"description":"The prices of the variant for presentation in different currencies"},"fulfillment_service":{"type":["null","string"],"description":"The fulfillment service for the variant"},"admin_graphql_api_id":{"type":["null","string"],"description":"The unique identifier for the variant used by the GraphQL Admin API"},"inventory_management":{"type":["null","string"],"description":"The method used to manage inventory for the variant"},"old_inventory_quantity":{"type":["null","integer"],"description":"The previous inventory quantity for the variant"}},"additionalProperties":true,"$schema":"http://json-schema.org/draft-07/schema#"},supportedSyncModes=[full_refresh, incremental],sourceDefinedCursor=true,defaultCursorField=[updated_at],sourceDefinedPrimaryKey=[[id]],namespace=inula_raw,isResumable=<null>,additionalProperties={}],syncMode=incremental,cursorField=[updated_at],destinationSyncMode=append_dedup,primaryKey=[[id]],generationId=1,minimumGenerationId=0,syncId=14029216,additionalProperties={}]],additionalProperties={}],failures=[io.airbyte.config.FailureReason@1e2d5843[failureOrigin=source,failureType=config_error,internalMessage=An error occured while producing records from BULK Job result. Trace: AttributeError("'NoneType' object has no attribute 'get'").,externalMessage=Something went wrong in the connector. See the logs for more details.,metadata=io.airbyte.config.Metadata@6101693e[additionalProperties={attemptNumber=0, jobId=14029216, from_trace_message=true, connector_command=read}],stacktrace=Traceback (most recent call last):
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 136, in read_file
    yield from self.produce_records(filename)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 130, in produce_records
    for record in self.process_line(jsonl_file):
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 95, in process_line
    yield from self.record_compose(loads(line))
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 81, in record_compose
    yield from self.buffer_flush()
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/record.py", line 69, in buffer_flush
    yield from self.record_process_components(record)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/query.py", line 2346, in record_process_components
    record["images"] = self._merge_with_media(record_components)
  File "/airbyte/integration_code/source_shopify/shopify_graphql/bulk/query.py", line 2314, in _merge_with_media
    image_url = item.get("image", {}).get("url")
AttributeError: 'NoneType' object has no attribute 'get'
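
For readers following the traceback: the failing expression passes a default to `dict.get()`, but that default only applies when the key is missing. In the bulk output the `image` key is present with a null value, so the first `.get()` returns `None` and the chained `.get("url")` raises. A minimal sketch of the behaviour and of a None-safe lookup follows. The helper name `safe_image_url` is hypothetical and is not part of the connector.

```python
import json

# A line shaped like the MediaImage node with a null image from the bulk JSONL output.
line = '{"__typename": "MediaImage", "image": null, "__parentId": "gid://shopify/Product/9062091161885"}'
item = json.loads(line)

# The key exists, so the {} default is ignored and None is returned.
assert item.get("image", {}) is None


def safe_image_url(node: dict):
    """Hypothetical helper: return the image URL, or None if the image is missing or null."""
    # `or {}` covers both a missing key and an explicit null value.
    image = node.get("image") or {}
    return image.get("url")
```

With this pattern a null `image` yields `None` instead of raising, and the caller still has to decide whether such a node should be kept.
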
bazarnov commented 4 months ago

@SebastienCY Thank you for the clarification. I've successfully reproduced the issue and included the fix here; it'll be merged as soon as possible: https://github.com/airbytehq/airbyte/pull/40593

SebastienCY commented 4 months ago

@bazarnov thank you !

SebastienCY commented 4 months ago

Hello @bazarnov, we may have another issue a little further down the sync process as a consequence of this fix. I'm not sure I understand all the ins and outs of the behaviour, but maybe fixing the url enrichment simply allowed the connector to emit records with a missing id (image id)? Wdyt?

bazarnov commented 4 months ago

@SebastienCY I'll definitely take a look at it, but it looks like we introduced a regression with this change: the destination expects the primary_key, such as id, to exist and be non-null for every record.

When the Image component is missing, the record will look like this:

{
    "created_at": "2024-06-12T23:41:27+00:00",
    "updated_at": "2024-06-12T23:41:28+00:00",
    "image": None,
    "shop_url": "test_shop"
}

There is no id field. Because there is no Image attached to the product, the record itself looks useless to me and should not be replicated without an image context. Correct me if I'm wrong, @SebastienCY.

If so, we should skip the records that don't hold the Image component at all. I'll prepare a fix for such cases.
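
As a rough illustration of the skipping idea, and not the actual connector change, the filter below drops records that have no image or no id before they are emitted. The function name `skip_records_without_image` and the shape of the complete record are assumptions made for this sketch.

```python
from typing import Iterable, Iterator


def skip_records_without_image(records: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical filter: yield only records that carry both an image and an id."""
    for record in records:
        # A record without an image has no id either, so the destination
        # could not deduplicate it on the primary key.
        if record.get("image") is None or record.get("id") is None:
            continue
        yield record


# The incomplete record from the comment above, plus an assumed complete one.
incomplete = {
    "created_at": "2024-06-12T23:41:27+00:00",
    "updated_at": "2024-06-12T23:41:28+00:00",
    "image": None,
    "shop_url": "test_shop",
}
complete = {
    "id": 46359370301725,
    "image": {"url": "https://cdn.shopify.com/xxxx"},
    "shop_url": "test_shop",
}
kept = list(skip_records_without_image([incomplete, complete]))
```

Here `kept` contains only the complete record, so nothing without a primary key reaches the destination.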