SeldonIO / MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
https://mlserver.readthedocs.io/en/latest/
Apache License 2.0
695 stars 179 forks

build: Lock GitHub runners' OS #1765

Closed jesse-c closed 4 months ago

jesse-c commented 4 months ago

This was motivated by our macOS jobs failing [2] because colima is missing. It looks like this is because the latest versions of the macOS runner no longer have colima installed by default [1].

colima is now explicitly installed. The --network-driver argument, which is incompatible with this version of colima, has been removed as well.

Since the macOS jobs don't run for PRs, I temporarily disabled that check so I could see the actions run. To see the results, look at the earlier runs for this PR, from before I re-enabled the PR check.

I think the way poetry install is run in both the workflow description and the Tox environments is wrong, or at least duplicative. The duplication makes the jobs slow, and you can see a mix of package upgrades and downgrades that could be avoided.

[1] https://github.com/actions/runner-images/issues/6216
[2] /Users/runner/work/_temp/f19ffbff-27a9-4fc7-80b6-97791d2de141.sh: line 9: colima: command not found
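
As an illustration of the failure mode in [2], a job could check up front that the tools it depends on are on PATH, rather than failing partway through a script. This is only a minimal sketch and not part of this PR: the function name missing_tools and the tool list in the usage comment are hypothetical.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the subset of `tools` that cannot be found on PATH."""
    # shutil.which returns None when no executable with that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


# Hypothetical usage in a setup step (the tool list is an assumption):
#
#     missing = missing_tools(["colima", "docker"])
#     if missing:
#         raise SystemExit(f"Missing required tools: {', '.join(missing)}")
```

A check like this would surface "colima is not installed" as an explicit message at the start of the job instead of a "command not found" on line 9 of a generated script.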

CLAassistant commented 4 months ago

CLA assistant check
All committers have signed the CLA.

jesse-c commented 4 months ago

Going to merge in, based on the latest results [1]. It looks like the new Docker setup still needs some work, based on the error message [2].

[1] https://github.com/SeldonIO/MLServer/actions/runs/9302553271?pr=1765
[2] https://github.com/SeldonIO/MLServer/actions/runs/9302553271/job/25602995328?pr=1765


self = <AsyncRetrying object at 0x122fad6c0 (stop=<tenacity.stop.stop_after_attempt object at 0x122fadbd0>, wait=<tenacity.wa...bject at 0x122fe9600>, before=<function before_nothing at 0x122fc9090>, after=<function after_nothing at 0x122fca7a0>)>
retry_state = <RetryCallState 4926532304: attempt #20; slept for 71.0; last result: failed (KafkaConnectionError KafkaConnectionError: Unable to bootstrap from [('localhost', 49621, <AddressFamily.AF_UNSPEC: 0>)])>

    def iter(self, retry_state: "RetryCallState") -> t.Union[DoAttempt, DoSleep, t.Any]:  # noqa
        fut = retry_state.outcome
        if fut is None:
            if self.before is not None:
                self.before(retry_state)
            return DoAttempt()

        is_explicit_retry = fut.failed and isinstance(fut.exception(), TryAgain)
        if not (is_explicit_retry or self.retry(retry_state)):
            return fut.result()

        if self.after is not None:
            self.after(retry_state)

        self.statistics["delay_since_first_attempt"] = retry_state.seconds_since_start
        if self.stop(retry_state):
            if self.retry_error_callback:
                return self.retry_error_callback(retry_state)
            retry_exc = self.retry_error_cls(fut)
            if self.reraise:
                raise retry_exc.reraise()
>           raise retry_exc from fut.exception()
E           tenacity.RetryError: RetryError[<Future at 0x125b33dc0 state=finished raised KafkaConnectionError>]
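
The trace above shows the Kafka client giving up after 20 attempts against a port on localhost. One way to narrow this down might be to check separately whether that port accepts TCP connections at all. The sketch below uses only the standard library; the helper name wait_for_port and its defaults are hypothetical and are not part of MLServer or its test suite. A successful TCP connect does not prove the broker is ready to bootstrap, so this only helps separate "port not reachable from the runner" from "broker not ready".

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 20, delay: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, else False.

    Tries up to `attempts` times, sleeping `delay` seconds between failures.
    """
    for attempt in range(1, attempts + 1):
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # when nothing is listening or the connect times out.
            with socket.create_connection((host, port), timeout=delay):
                return True
        except OSError:
            if attempt < attempts:
                time.sleep(delay)
    return False
```

If a call such as wait_for_port("localhost", 49621) returned False on the macOS runner, that would point at the container's port not being reachable from the host, which would be consistent with the Docker setup needing more work; if it returned True, the problem would more likely be in broker startup or configuration.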