man-group / ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
http://arcticdb.io
Other
1.51k stars 93 forks source link

Fix publish result curl fail #1974

Closed phoebusm closed 2 weeks ago

phoebusm commented 2 weeks ago

Reference Issues/PRs

1971

What does this implement or fix?

Fix publish benchmark fails

Any other comments?

Test result: https://github.com/man-group/ArcticDB/actions/runs/11678805668 The git command fails due to the testing branch is not master. The S3 exception blocking step has been cleared.

The publish benchmark step fails due to SSL CA cert cannot be found by ArcticDB. It's because of https://github.com/man-group/ArcticDB/issues/1129. Instead of modifying the script to add CA certs to the connection strings, adding the softlink of the CA certs (ubuntu style -> rhel style) is a more staight forward option.

Ubuntu docker container is needed to be used as the native runner image blocks adding softlinks at the required location. Thus the new environment setup step.

Checklist

Checklist for code changes... - [ ] Have you updated the relevant docstrings, documentation and copyright notice? - [ ] Is this contribution tested against [all ArcticDB's features](../docs/mkdocs/docs/technical/contributing.md)? - [ ] Do all exceptions introduced raise appropriate [error messages](https://docs.arcticdb.io/error_messages/)? - [ ] Are API changes highlighted in the PR description? - [ ] Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?
phoebusm commented 2 weeks ago

Why isn't the automated certificate detection that you added in the s3_library_adapter.py working?

        if not self._ca_cert_path and not self._ca_cert_dir and platform.system() == "Linux":
            if ssl.get_default_verify_paths().cafile is not None:
                self._ca_cert_path = ssl.get_default_verify_paths().cafile
            if ssl.get_default_verify_paths().capath is not None:
                self._ca_cert_dir = ssl.get_default_verify_paths().capath

Currently real_s3... doesn't support automatically assign ca file and ca path. Adding that bit is easy but when I modified S3Bucket previously for SSL related test, I didn't expect it will be used for tests other in gh or moto. So setting ca directory is skipped on purpose: https://github.com/man-group/ArcticDB/blob/658278ebe5c1a2b9f5e0a619ac1246601150b548/python/arcticdb/storage_fixtures/s3.py#L72

I will need to revamp this bit or the delicate ssl tests will just collapsed. A ticket for tracking: https://github.com/man-group/ArcticDB/issues/1986