airbytehq / airbyte


Source S3: connector endpoint #23715

Status: Closed (davira closed this issue 1 year ago)

davira commented 1 year ago
## Environment

- **Airbyte version**: 0.41.0
- **OS Version / Instance**: Ubuntu 22.04, Huawei Cloud ECS
- **Deployment**: Docker
- **Source Connector and version**: S3 connector
- **Destination Connector and version**: (if applicable example Postgres 0.3.3)

Current Behavior

When I configure the S3 source connector, which is meant to work with S3-compatible storage, it cannot connect to Huawei Cloud OBS, which exposes an S3-compatible API. The problem is in the endpoint configuration. Other open source tools that speak the S3 protocol can connect to OBS.

Expected Behavior

The connector should connect to OBS successfully.

Logs

  File "/usr/local/lib/python3.9/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/usr/local/lib/python3.9/site-packages/botocore/session.py", line 976, in create_client
    client = client_creator.create_client(
  File "/usr/local/lib/python3.9/site-packages/botocore/client.py", line 155, in create_client
    client_args = self._get_client_args(
  File "/usr/local/lib/python3.9/site-packages/botocore/client.py", line 485, in _get_client_args
    return args_creator.get_client_args(
  File "/usr/local/lib/python3.9/site-packages/botocore/args.py", line 129, in get_client_args
    endpoint = endpoint_creator.create_endpoint(
  File "/usr/local/lib/python3.9/site-packages/botocore/endpoint.py", line 402, in create_endpoint
    raise ValueError("Invalid endpoint: %s" % endpoint_url)
ValueError: Invalid endpoint: obs.ap-southeast-2.myhuaweicloud.com
2023-03-03 06:58:58 ERROR i.a.w.i.DefaultAirbyteStreamFactory(internalLog):163 - Check failed
2023-03-03 06:58:59 INFO i.a.w.g.DefaultCheckConnectionWorker(run):114 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@614d5571[status=failed,message=ValueError('Invalid endpoint: obs.ap-southeast-2.myhuaweicloud.com')]
2023-03-03 06:58:59 INFO i.a.c.i.LineGobbler(voidCall):114 -
2023-03-03 06:58:59 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-03-03 06:58:59 INFO i.a.c.i.LineGobbler(voidCall):114 - ----- END CHECK -----
2023-03-03 06:58:59 INFO i.a.c.i.LineGobbler(voidCall):114 -

Steps to Reproduce

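  1. Configure the S3 source connector with the Huawei Cloud OBS endpoint (obs.ap-southeast-2.myhuaweicloud.com)
  2. The connection check fails with `ValueError: Invalid endpoint`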

sajarin commented 1 year ago

Hey @davira

I don't know whether the S3 connector supports OBS. According to the stack trace, the check fails inside botocore, the library underlying the S3 connector: https://github.com/boto/botocore/blob/b66f24b45cf2adca2f6cbf8944bbb4eafb14e48b/botocore/utils.py#L1230
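
For readers following the traceback, the kind of check that raises `Invalid endpoint` can be approximated with the standard library alone. The sketch below is an assumption about what the validation does (it needs a URL that parses with a hostname), not a copy of botocore's code, and `has_hostname` is a hypothetical helper name. It shows that the endpoint string from the logs, written without a scheme, parses with no hostname:

```python
from urllib.parse import urlsplit


def has_hostname(endpoint_url: str) -> bool:
    """Return True if the string parses as a URL with a hostname.

    Hypothetical stand-in for the endpoint validation; without a
    scheme such as https://, urlsplit treats the whole string as a path.
    """
    return urlsplit(endpoint_url).hostname is not None


# Endpoint exactly as it appears in the logs (no scheme).
bare = "obs.ap-southeast-2.myhuaweicloud.com"
# Same host with an explicit scheme (an assumption, not from the logs).
with_scheme = "https://obs.ap-southeast-2.myhuaweicloud.com"

print(has_hostname(bare))         # False
print(has_hostname(with_scheme))  # True
```

If this approximation holds, the error is about the form of the configured string rather than about the OBS host being unreachable.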

davira commented 1 year ago

Hi @sajarin, thanks for the answer. The endpoint is obs.ap-southeast-2.myhuaweicloud.com, which is not an invalid endpoint. It is also interesting that the S3 destination seems to work with the same endpoint.

tolik0 commented 1 year ago

The S3 connector does not support Huawei Cloud OBS. The reason is that the S3 connector relies on the boto3 library, which is incompatible with Huawei OBS. In contrast, destination-s3 has a different underlying architecture.
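
The thread does not settle whether boto3 can talk to OBS. Separately from that question, the `ValueError` in the logs concerns the form of the endpoint string. A minimal sketch, assuming the endpoint has to be a full URL including a scheme; `normalize_endpoint` is a hypothetical helper, not part of Airbyte or boto3:

```python
from urllib.parse import urlsplit


def normalize_endpoint(endpoint: str, default_scheme: str = "https") -> str:
    """Prefix a bare host with a scheme and check that it parses.

    Hypothetical helper for illustration only.
    """
    endpoint = endpoint.strip()
    # Add a scheme only when the caller did not supply one.
    if "://" not in endpoint:
        endpoint = f"{default_scheme}://{endpoint}"
    # Reject strings that still have no hostname after prefixing.
    if urlsplit(endpoint).hostname is None:
        raise ValueError(f"Invalid endpoint: {endpoint}")
    return endpoint


endpoint_url = normalize_endpoint("obs.ap-southeast-2.myhuaweicloud.com")
print(endpoint_url)  # https://obs.ap-southeast-2.myhuaweicloud.com
```

The resulting string is the kind of value accepted by the `endpoint_url` argument of `boto3.client("s3", ...)`. Whether OBS then serves the connector's requests correctly is not tested here.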