airbytehq / airbyte


Salesforce source appears to be Airtable through API #43461

Open fulljoin-home opened 1 month ago

fulljoin-home commented 1 month ago

Platform Version

airbyte-api==0.51.0

What step the error happened?

During the Sync

Relevant information

I am trying to create a Salesforce source through the airbyte_api Python package:

import airbyte_api
from airbyte_api import models, api
from datetime import datetime

s = airbyte_api.AirbyteAPI(
    security=models.Security(
        client_credentials=models.SchemeClientCredentials(
            client_id=client_id,
            client_secret=client_secret,
            TOKEN_URL="/v1/applications/token",
        ),
    ),
)

res = s.sources.create_source(request=models.SourceCreateRequest(
    configuration=models.SourceSalesforce(
        client_id=client_id,
        client_secret=client_secret,
        refresh_token=refresh_token,
        force_use_bulk_api=True,
        start_date='2019-01-01T00:00:00Z'
    ),
    name='Salesforce test',
    workspace_id='31c70a11-2467-476d-b077-1029687aecc8',
))

print(res)

It gives me this response:

CreateSourceResponse(content_type='application/json', status_code=200, raw_response=<Response [200]>, source_response=SourceResponse(configuration=SourceAirtable(credentials=None, SOURCE_TYPE=<SourceAirtableAirtable.AIRTABLE: 'airtable'>), name='Salesforce test5', source_id='32a21312-4598-4f29-9636-545b5b7d114e', source_type='salesforce', workspace_id='31c70a11-2467-476d-b077-1029687aecc8'))

I'm not sure why it attempts to create a SourceAirtable.

Testing connectivity for this newly created source also fails in the Airbyte Cloud UI.

Creating a Salesforce source from the UI produces a SourceAirtable object as well. (I tried listing sources through the API and it returns 2 Airtable sources.)

Relevant log output

CreateSourceResponse(content_type='application/json', status_code=200, raw_response=<Response [200]>, source_response=SourceResponse(configuration=SourceAirtable(credentials=None, SOURCE_TYPE=<SourceAirtableAirtable.AIRTABLE: 'airtable'>), name='Salesforce test5', source_id='32a21312-4598-4f29-9636-545b5b7d114e', source_type='salesforce', workspace_id='31c70a11-2467-476d-b077-1029687aecc8'))

marcosmarxm commented 1 month ago

cc @airbytehq/platform-compose

I had the same issue locally. Can someone take a look?

pedroslopez commented 2 weeks ago

I took an initial look at this and it does seem like there's an issue with our API responses that is causing a mismatch. However, this only affects the Python typings and not the actual behavior: the correct type of source is created in Airbyte.

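As you can see from the example provided in the issue, the CreateSourceResponse actually has the correct source type returned in res.source_response.source_type ('salesforce'). However, the configuration is deserialized as SourceAirtable, so the wrong type is set on res.source_response.configuration.SOURCE_TYPE.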
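Until this is fixed, a caller can read the top-level source_type field instead of inspecting the class of the configuration object. The following is a minimal sketch of that workaround, not SDK code: reported_source_type is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mirror the shape of the response printed in the issue.

```python
from types import SimpleNamespace


def reported_source_type(res):
    """Return the source type from the top-level response field.

    Deliberately ignores the class of `configuration`, which may be mistyped.
    """
    return res.source_response.source_type


# Stand-in object mirroring the response shown in the issue (assumption:
# only the attributes used here are modeled, not the real SDK classes).
res = SimpleNamespace(
    source_response=SimpleNamespace(
        configuration=SimpleNamespace(credentials=None, SOURCE_TYPE="airtable"),
        name="Salesforce test5",
        source_type="salesforce",
    )
)

print(reported_source_type(res))  # prints "salesforce"
```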

It seems that when sending a source create request, the sourceType is set inside the configuration object, like so:

POST /v1/sources
{
  "configuration": {
    "sourceType": "salesforce",
    "auth_type": "Client",
    "client_id": "asdf",
    "client_secret": "asdf",
    "refresh_token": "asdf"
  },
  "workspaceId": "my-id",
  "name": "my source from api"
}

However, in the response, the sourceType lives outside of the configuration object:

{
    "sourceId": "41d28a8d-d2d5-4e2c-add8-f945402abd36",
    "name": "my source from api",
    "sourceType": "salesforce",
    "workspaceId": "my-id",
    "configuration": {
        "auth_type": "Client",
        "client_id": "******",
        "client_secret": "**********",
        "refresh_token": "**********"
    }
}

The Python library likely has trouble interpreting this and matching the configuration to the right type inside SourceResponse. We should fix this in the API so that the response format matches the request format, which would likely fix the Python SDK behavior.
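To illustrate how a missing sourceType could lead to a wrong match, here is a toy decoder, assuming the library tries candidate types in order when no discriminator is present. This is a hypothetical sketch and not the SDK's actual deserialization code; AirtableConfig, SalesforceConfig, CANDIDATES, and decode_config are made-up names.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class AirtableConfig:
    # Every field has a default, so any payload satisfies this type.
    credentials: Optional[dict] = None
    source_type: str = "airtable"


@dataclass
class SalesforceConfig:
    client_id: str
    client_secret: str
    refresh_token: str
    source_type: str = "salesforce"


# Hypothetical candidate order; the first structural match wins.
CANDIDATES = [AirtableConfig, SalesforceConfig]


def required_fields(cls):
    """Names of dataclass fields that have no default value."""
    return {
        f.name
        for f in fields(cls)
        if f.default is MISSING and f.default_factory is MISSING
    }


def decode_config(payload):
    """Pick a config class by discriminator, else by first structural match."""
    tag = payload.get("sourceType")
    for cls in CANDIDATES:
        if tag is not None:
            # Discriminator present: match on it alone.
            if cls.source_type == tag:
                return cls
            continue
        # No discriminator: accept the first class whose required fields exist.
        if required_fields(cls).issubset(payload):
            return cls
    return None


request_cfg = {"sourceType": "salesforce", "client_id": "asdf",
               "client_secret": "asdf", "refresh_token": "asdf"}
response_cfg = {"auth_type": "Client", "client_id": "******",
                "client_secret": "**********", "refresh_token": "**********"}

print(decode_config(request_cfg).__name__)   # SalesforceConfig
print(decode_config(response_cfg).__name__)  # AirtableConfig
```

In this sketch the response configuration carries no sourceType, so the first candidate with no required fields is chosen, which matches the symptom in the issue. Returning sourceType inside configuration would let the decoder pick the type directly.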