aimhubio / aim

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
https://aimstack.io
Apache License 2.0
4.94k stars, 299 forks

Random run initialization error #3057

Open Luca-DTU opened 7 months ago

Luca-DTU commented 7 months ago

🐛 Bug

Sometimes when setting up a run I get the following error: "TypeError: DBAPIError.__init__() missing 2 required positional arguments: 'params' and 'orig'". It arises from calling run = Run(repo=repo, experiment=experiment, log_system_params=True).

To reproduce

It happens when starting multiple parallel runs against the same repository in Argo Workflows. Out of 100 pods, maybe a couple will show this error.

Environment

aim==3.17.5
python:3.10
argo-workflows==6.3.5
hera==5.4.1

alberttorosyan commented 7 months ago

Hey @Luca-DTU! Could you please share the full error stack trace? That would help us understand the root cause of the issue.

Luca-DTU commented 7 months ago

Hi, here is one example:

Traceback (most recent call last):
  File "/argo/staging/script", line 17, in <module>
    run = Run(repo=repo_name, experiment=experiment_name, log_system_params=True)
  File "/usr/local/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 70, in wrapper
    _SafeModeConfig.exception_callback(e, func)
  File "/usr/local/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 47, in reraise_exception
    raise e
  File "/usr/local/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/aim/sdk/run.py", line 828, in __init__
    super().__init__(run_hash, repo=repo, read_only=read_only, experiment=experiment, force_resume=force_resume)
  File "/usr/local/lib/python3.10/site-packages/aim/sdk/run.py", line 325, in __init__
    self.props
  File "/usr/local/lib/python3.10/site-packages/aim/sdk/run.py", line 440, in props
    self._props = self.repo.request_props(self.hash, self.read_only)
  File "/usr/local/lib/python3.10/site-packages/aim/sdk/repo.py", line 354, in request_props
    return StructuredRunProxy(self.client, hash, read_only, created_at)
  File "/usr/local/lib/python3.10/site-packages/aim/storage/structured/proxy.py", line 31, in __init__
    handler = self._rpc_client.get_resource_handler(self, self.resource_type, args=self.init_args)
  File "/usr/local/lib/python3.10/site-packages/aim/ext/transport/client.py", line 225, in get_resource_handler
    raise_exception(response.exception)
  File "/usr/local/lib/python3.10/site-packages/aim/ext/transport/message_utils.py", line 76, in raise_exception
    raise exception(args) if args else exception()
TypeError: DBAPIError.__init__() missing 2 required positional arguments: 'params' and 'orig'
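
Reading the last two frames, it looks like the client tries to re-create the server-side exception from its class and the arguments it received, and the TypeError comes from that re-creation step rather than from the original failure. Below is a minimal sketch of that mechanism, not aim or SQLAlchemy code. FakeDBAPIError and rebuild_error are hypothetical names, the constructor signature (statement, params, orig) is assumed from the error message, and raise_exception only imitates the shape of the line shown in the traceback.

```python
class FakeDBAPIError(Exception):
    """Hypothetical stand-in: requires three positional arguments,
    as the error message in the traceback implies for DBAPIError."""

    def __init__(self, statement, params, orig):
        super().__init__(statement)
        self.statement = statement
        self.params = params
        self.orig = orig


def raise_exception(exception, args):
    # Same shape as the final frame of the traceback: the exception class
    # is called with a single argument, or with none at all.
    raise exception(args) if args else exception()


def rebuild_error(args):
    """Return the message of the TypeError raised while rebuilding, if any."""
    try:
        raise_exception(FakeDBAPIError, args)
    except TypeError as err:
        # The constructor call itself failed, so the original error is lost.
        return str(err)
    except FakeDBAPIError:
        return "rebuilt successfully"


# Placeholder text; the real server-side message is not shown in this thread.
message = rebuild_error(["original server-side message"])
print(message)
```

If this reading is right, the TypeError is masking a database error raised on the tracking server, so the server logs may be the place to look for the underlying cause.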