Closed by martroben 3 months ago
Upon further investigation, the correct command to check whether the OneLake certificate is a Certificate Authority or an End Entity certificate is this:
openssl s_client -connect onelake.blob.fabric.microsoft.com:443 -showcerts | openssl x509 -text | grep "Basic Constraints" -A 1
This returns CA:TRUE, signifying that the certificate used is in fact a Certificate Authority cert, as the error suggests.
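For anyone who wants to script the check rather than read the grep output by eye, here is a minimal sketch. It assumes the text comes from `openssl x509 -text` and that the CA flag is printed on the line following the "Basic Constraints" header, as in the command above. The function name `basic_constraints_is_ca` and the sample text are made up for illustration; the sample is hand-written, not captured from OneLake.

```python
import re
from typing import Optional


def basic_constraints_is_ca(x509_text: str) -> Optional[bool]:
    """Return the CA flag from `openssl x509 -text` output.

    Returns True or False when a Basic Constraints extension is found,
    and None when the extension (or its CA flag) is missing.
    """
    lines = x509_text.splitlines()
    for i, line in enumerate(lines):
        if "Basic Constraints" in line:
            # openssl prints the extension value on the following line.
            value = lines[i + 1] if i + 1 < len(lines) else ""
            match = re.search(r"CA:(TRUE|FALSE)", value)
            if match:
                return match.group(1) == "TRUE"
    return None


# Hand-written sample in the layout openssl uses (not real OneLake output).
sample = """\
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:TRUE
"""

print(basic_constraints_is_ca(sample))
```

A result of True corresponds to the CA:TRUE case described above; None means the certificate text carried no Basic Constraints extension at all.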
I'm not expecting Microsoft to alter their certificates to make it easier for people to use Polars in Fabric, so some workaround would still be appreciated.
Can anyone tell what underlying module or crate is raising the CaUsedAsEndEntity error? Maybe some setting can be passed to skip the CA vs EE check. (Somehow the Spark-based delta.tables module doesn't seem to be bothered by a CA cert used as an EE cert.)
@martroben all storage options are passed to the "object store" crate
Posted an issue/question to object_store repo: https://github.com/apache/arrow-rs/issues/5696
It looks like the object store version bump caused this: https://github.com/delta-io/delta-rs/pull/2311. With an older version of the deltalake library, deltalake==0.16.2, this works fine.
@hnasrullakhan please open an issue in the arrow-rs repo then; there were zero code changes on our side.
There was a bump of the object-store version to 0.9.1: https://github.com/delta-io/delta-rs/pulls?page=2&q=is%3Apr+label%3Abinding%2Fpython+is%3Aclosed
@hnasrullakhan that's correct, but what I am saying is that this didn't require any changes on our side other than bumping the version. So you should open an issue upstream.
Upon further testing, deltalake==0.16.1 works fine, but starting from 0.16.2, I'm getting the error (also tested the latest: 0.17.3).
I think @hnasrullakhan also confirms that - their earlier claim about 0.16.2 working fine was a typo.
I don't see any changes in object_store version between 0.16.1 and 0.16.2 (granted, I don't speak fluent Rust).
Does anyone have any idea what else could have introduced this error between these two versions?
@martroben can you check against v0.18?
@ion-elgreco, I sure can, but on Monday, when I'm back in the office.
What changed in v0.18, @ion-elgreco?
Could still repro with v0.18
@ion-elgreco, I confirm @hnasrullakhan's position: the same issue still occurs, even with deltalake==0.18.0:
error trying to connect: invalid peer certificate: Other(CaUsedAsEndEntity)
As an aside - I'm trying to push a case with MS support in parallel. Their initial position was that since there is no problem with the older versions of deltalake, it's a 3rd party problem.
I suggested that the root cause is still their improper use of certificates - the 3rd parties might have just tightened the rules about what they find acceptable to work with. Not sure if I'll win this argument though, being a mere mortal.
What can men do against such reckless hate? - Théoden, son of Thengel
Apparently Polars is not the only downstream library where delta lake interactions broke around deltalake v0.17. The linked issue does not seem to be related to Fabric certificates however.
Nevertheless, for anyone looking, Daft might be a viable alternative to Polars soon - especially if they implement deletes and merges for delta lake.
That issue is not really related; Daft is using our internal methods for their writer. When we make changes in our internal methods, this is not marked as a breaking change :)
Thank you for the context @ion-elgreco. In that case it is indeed somewhat unfair for them to cite breaking changes in deltalake, when the issue is at least partly caused by their own misjudgment of what is exposed and what is not.
I'm still trying to understand, though, what the exact change between v0.16.1 and v0.16.2 was that altered the behaviour of SSL connections.
Apparently the problem is no longer present in v0.18.1.
Not sure what caused the fix between 0.18.0 and 0.18.1. If I had to guess, it might be bumping object store from 0.9 to 0.10 where object store updated their reqwest dependency. I guess we'll never know, but I'm nonetheless happy.
Microsoft is still using a self-signed CA certificate as EE certificate in OneLake connections from Fabric. However, I had a call with their support and the product team has apparently promised to do something with the certificate. Not sure what, though. Hopefully it will not break whatever caused the fix.
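To summarise the version findings reported in this thread (0.16.1 works; 0.16.2, 0.17.3 and 0.18.0 fail; 0.18.1 works again), here is a rough sketch of a check for whether a given deltalake version falls in the range that was reported as failing. The boundaries come only from the comments above, not from any changelog, and the helper names `parse_version` and `reported_as_affected` are hypothetical.

```python
import re
from typing import Tuple

# Boundaries as reported by commenters in this thread (an assumption,
# not something confirmed by the deltalake release notes).
FIRST_FAILING: Tuple[int, int, int] = (0, 16, 2)
FIRST_FIXED: Tuple[int, int, int] = (0, 18, 1)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def reported_as_affected(version: str) -> bool:
    """True if the version lies in the range reported as failing here."""
    return FIRST_FAILING <= parse_version(version) < FIRST_FIXED


for candidate in ("0.16.1", "0.16.2", "0.17.3", "0.18.0", "0.18.1"):
    print(candidate, reported_as_affected(candidate))
```

In a notebook, the installed version could be fed in via `importlib.metadata.version("deltalake")`, which raises `PackageNotFoundError` if the package is not installed.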
Additionally, I get the following similar error on 0.18.1 when performing a write on a very large data table: OSError: Generic MicrosoftAzure error: Error after 1 retries in 7824.991305s, max_retries:10, retry_timeout:180s, source:error sending request for url ...
Should I make a new issue for this? @ion-elgreco
Microsoft deployed a new update to the notebook environment which should have fixed this issue. Could you please give it another try? (It may take some time to reach your particular region.)
Environment
Delta-rs version: Python deltalake-0.17.1
Cloud provider: Microsoft (North Europe)
Environment: Microsoft Fabric Notebook
OS: Fabric VM (CBL-Mariner Linux)
Bug
What happened: While trying to create a Delta Table from a path in a Fabric Notebook, I'm getting the following error:
OSError: Generic MicrosoftAzure error: Error after 10 retries in 1.992440924s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://onelake.blob.fabric.microsoft.com/<workspace id>/<lakehouse id>/Tables/some_table/_delta_log/_last_checkpoint): error trying to connect: invalid peer certificate: Other(CaUsedAsEndEntity)
What you expected to happen: To get a deltalake.DeltaTable instance without any errors.
How to reproduce it: Run the following code in a Fabric Notebook:
More details: storage_options["allow_invalid_certificates"] = "true" can be used as a quickfix.
Here are the certificate details fetched by openssl s_client -showcerts -connect onelake.blob.fabric.microsoft.com:443 in the Fabric Notebook:
It doesn't seem to be a Certificate Authority certificate. More like a self-signed certificate, so I don't know why the error is CaUsedAsEndEntity.
Interestingly, the same openssl operation used to give a self-signed certificate error (see this deltalake issue for details), but it seems that something has changed in the openssl setup of the underlying Fabric VMs.
If anyone has any ideas for how to start solving this new issue (other than using the "allow_invalid_certificates"-hammer in perpetuity), I would be most thankful.
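As a slightly narrower alternative to setting the quickfix unconditionally, the sketch below only falls back to "allow_invalid_certificates" when the failure message actually mentions CaUsedAsEndEntity. It assumes the error surfaces as an OSError, as in the traceback above. The helper names `open_with_cert_fallback` and `open_delta_table` are hypothetical, and the opener is passed in as a callable so the retry logic can be tried without a Fabric connection. Note that the fallback still disables certificate validation for that connection, so it remains a workaround rather than a fix.

```python
from typing import Any, Callable, Dict, Optional

Opener = Callable[[str, Dict[str, str]], Any]


def open_with_cert_fallback(
    open_table: Opener,
    path: str,
    storage_options: Optional[Dict[str, str]] = None,
) -> Any:
    """Try a normal open first; retry insecurely only on CaUsedAsEndEntity."""
    options = dict(storage_options or {})
    try:
        return open_table(path, options)
    except OSError as err:
        # Any other failure is re-raised untouched.
        if "CaUsedAsEndEntity" not in str(err):
            raise
        # Workaround from this issue: skip certificate validation.
        insecure = {**options, "allow_invalid_certificates": "true"}
        return open_table(path, insecure)


def open_delta_table(path: str, storage_options: Dict[str, str]) -> Any:
    """Opener that uses deltalake; imported lazily so the sketch loads without it."""
    from deltalake import DeltaTable

    return DeltaTable(path, storage_options=storage_options)
```

In a Fabric Notebook this might be called as `open_with_cert_fallback(open_delta_table, table_path, storage_options)`, where `table_path` and `storage_options` are whatever is already being passed to `DeltaTable`.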