IBM / ibm-cos-sdk-python

Apache License 2.0

Collaboration/Contribution for Great Expectations support of IBM Object Storage #53

Closed rdodev closed 1 year ago

rdodev commented 1 year ago

Hey folks 👋

We have some users of GX who keep their data in IBM Cloud Object Storage. GX supports AWS S3 via boto3; however, there are some differences we don't fully understand between what boto does and what ibm-boto does, and they are causing fatal errors when trying to list and retrieve objects from IBM COS. Would you be willing to collaborate with us, or contribute, toward a solution to this issue? Thanks!

avinash1IBM commented 1 year ago

Hello @rdodev, can you share more details about the issue, such as the error output and sample code that reproduces it? We will take a look and help you fix it.

rdodev commented 1 year ago

Hey @avinash1IBM

Thanks for the prompt response. This issue is a good example of what users are experiencing.

avinash1IBM commented 1 year ago

Hello @rdodev, thanks for sharing the issue. After going through it, here are my observations:

  1. The user who reported that issue clearly explained that the GX code constructs a URL for another cloud provider and ignores the endpoint URL they specified. That is the mismatch. To connect to IBM COS rather than another cloud provider, the endpoint URL that is specified must be one of these. So your code should either construct the URL based on the service provider or take the user-specified endpoint URL into consideration.

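
To illustrate the second option (taking the user-specified endpoint URL into consideration), here is a minimal sketch. It assumes the caller ends up in `boto3.client("s3", ...)`; the helper names `s3_client_kwargs` and `make_s3_client` are hypothetical and are not part of GX or of this SDK.

```python
from typing import Optional


def s3_client_kwargs(endpoint_url: Optional[str] = None,
                     region_name: Optional[str] = None) -> dict:
    """Build keyword arguments for boto3.client("s3", ...).

    Hypothetical helper: a user-supplied endpoint is passed through
    unchanged instead of being replaced by a provider-specific URL.
    """
    kwargs = {}
    if region_name:
        kwargs["region_name"] = region_name
    if endpoint_url:
        # Honor the endpoint exactly as the user wrote it.
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_s3_client(endpoint_url: Optional[str] = None,
                   region_name: Optional[str] = None):
    """Create an S3-compatible client; boto3 is imported only when called."""
    import boto3  # lazy import so the helper above has no hard dependency

    return boto3.client("s3", **s3_client_kwargs(endpoint_url, region_name))
```

With no `endpoint_url`, boto3 falls back to its default AWS endpoint resolution. For IBM COS the value would be whichever endpoint applies to the bucket's location.
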
rdodev commented 1 year ago

@avinash1IBM aside from URL construction, is there a difference between standard boto3 and the fork included in this SDK?

avinash1IBM commented 1 year ago

There is one more difference, although it does not apply to the use case that user mentioned. IBM Cloud Object Storage has two types of credentials for authentication:

  1. IAM credentials (apiKey and serviceInstanceId)
  2. HMAC credentials (accessKey and SecretAccessKey)

As long as a user connects to IBM COS with HMAC credentials, another cloud provider's SDK works and the endpoint URL is the only thing that needs to be checked. If the user uses IAM credentials, they need to use the IBM botocore SDK.

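
The credential split above could be sketched roughly as follows. The dictionary keys (`api_key`, `service_instance_id`, `access_key_id`, `secret_access_key`) and the function names are assumptions made for this example only. The IAM branch assumes `ibm-cos-sdk` is installed, and the HMAC branch assumes plain `boto3`.

```python
def pick_sdk(credentials: dict) -> str:
    """Return which SDK the given credentials require (hypothetical helper)."""
    if "api_key" in credentials and "service_instance_id" in credentials:
        return "ibm_boto3"  # IAM credentials need the IBM fork
    if "access_key_id" in credentials and "secret_access_key" in credentials:
        return "boto3"      # HMAC credentials work with standard boto3
    raise ValueError("expected IAM or HMAC credentials")


def make_cos_client(endpoint_url: str, credentials: dict):
    """Create a client for IBM COS; SDKs are imported only when needed."""
    if pick_sdk(credentials) == "ibm_boto3":
        import ibm_boto3
        from ibm_botocore.client import Config

        return ibm_boto3.client(
            "s3",
            ibm_api_key_id=credentials["api_key"],
            ibm_service_instance_id=credentials["service_instance_id"],
            config=Config(signature_version="oauth"),  # IAM token auth
            endpoint_url=endpoint_url,
        )

    import boto3

    return boto3.client(
        "s3",
        aws_access_key_id=credentials["access_key_id"],
        aws_secret_access_key=credentials["secret_access_key"],
        endpoint_url=endpoint_url,  # must point at IBM COS, not AWS
    )
```

In both branches the endpoint URL is passed explicitly, which matches the earlier point that the endpoint is the first thing to check.
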
rdodev commented 1 year ago

Thanks for that. Appreciate the response. I'll go back to my team and present these answers.

avinash1IBM commented 1 year ago

Closing as resolved