Closed nickjalbert closed 2 years ago
@andyk let me know if this is an expected issue or not!
@andyk unfortunately, still seeing this issue.
On a clean checkout of master of AOS, I start the webserver (confirmed I can load localhost:8000) and run the following (Ubuntu on WSL2):
$ rm -rf mlruns && rm -rf output.txt && rm -rf documentation/demos/mlruns
$ git rev-parse HEAD
593c3a20fa0499ba55049942ea0a08067d89bb82
$ bash documentation/demos/demo_ilya_papag_from_cli.sh
and once it runs, on the webserver tab I see the following error:
Not Found: /api/v1/runs/656fdba774124ee4a76bf05b89efdb54/
[19/Apr/2022 10:46:33] "GET /api/v1/runs/656fdba774124ee4a76bf05b89efdb54/ HTTP/1.1" 404 23
run_command: None
{"metrics": {"mean_reward": -20.166666666666668, "episode_count": 24.0, "median_reward": -20.0, "training_step_count": 21474.0, "step_count": 21474.0, "min_reward": -21.0, "training_episode_count": 24.0, "max_reward": -16.0}, "params": {}, "tags": {"run_type": "learn", "pcs.is_agent_run": "True", "papag_agent_run": "True", "mlflow.runName": "AgentOS learn with Agent 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82' and Env 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82'", "environment_identifier": "agent==593c3a20fa0499ba55049942ea0a08067d89bb82", "agent_identifier": "agent==593c3a20fa0499ba55049942ea0a08067d89bb82", "mlflow.source.git.commit": "964193814522f8df7288379fc1b0741985da5ba8", "pcs.is_run": "True", "mlflow.source.name": "/home/nickj/agentos/lean-env/bin/agentos", "mlflow.user": "nickj", "mlflow.parentRunId": "2d69f36c740a439d852fe506c7babbe0", "mlflow.source.type": "LOCAL"}}
tags: {'run_type': 'learn', 'pcs.is_agent_run': 'True', 'papag_agent_run': 'True', 'mlflow.runName': "AgentOS learn with Agent 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82' and Env 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82'", 'environment_identifier': 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82', 'agent_identifier': 'agent==593c3a20fa0499ba55049942ea0a08067d89bb82', 'mlflow.source.git.commit': '964193814522f8df7288379fc1b0741985da5ba8', 'pcs.is_run': 'True', 'mlflow.source.name': '/home/nickj/agentos/lean-env/bin/agentos', 'mlflow.user': 'nickj', 'mlflow.parentRunId': '2d69f36c740a439d852fe506c7babbe0', 'mlflow.source.type': 'LOCAL'}
agent_id: agent==593c3a20fa0499ba55049942ea0a08067d89bb82
env_id: agent==593c3a20fa0499ba55049942ea0a08067d89bb82
Internal Server Error: /api/v1/runs/
Traceback (most recent call last):
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/query.py", line 581, in get_or_create
return self.get(**kwargs), False
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/query.py", line 435, in get
raise self.model.DoesNotExist(
registry.models.Component.DoesNotExist: Component matching query does not exist.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.NotNullViolation: null value in column "instantiate" violates not-null constraint
DETAIL: Failing row contains (2022-04-19 10:46:33.173744+00, 2022-04-19 10:46:33.173765+00, agent==593c3a20fa0499ba55049942ea0a08067d89bb82, , , , , null, null).
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
response = get_response(request)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/core/handlers/base.py", line 181, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
return view_func(*args, **kwargs)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/rest_framework/viewsets.py", line 125, in view
return self.dispatch(request, *args, **kwargs)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/rest_framework/views.py", line 509, in dispatch
response = self.handle_exception(exc)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/rest_framework/views.py", line 469, in handle_exception
self.raise_uncaught_exception(exc)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
raise exc
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/rest_framework/views.py", line 506, in dispatch
response = handler(request, *args, **kwargs)
File "/usr/lib/python3.9/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/home/nickj/clean/agentos/web/registry/views.py", line 93, in create
run = Run.create_from_request_data(request.data)
File "/home/nickj/clean/agentos/web/registry/models.py", line 351, in create_from_request_data
agent_comp, agent_comp_created = Component.objects.get_or_create(
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/query.py", line 588, in get_or_create
return self.create(**params), True
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/query.py", line 453, in create
obj.save(force_insert=True, using=self.db)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/base.py", line 739, in save
self.save_base(using=using, force_insert=force_insert,
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/base.py", line 776, in save_base
updated = self._save_table(
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/base.py", line 881, in _save_table
results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/base.py", line 919, in _do_insert
return manager._insert(
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/query.py", line 1270, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
cursor.execute(sql, params)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 98, in execute
return super().execute(sql, params)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/nickj/agentos/lean-env/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: null value in column "instantiate" violates not-null constraint
DETAIL: Failing row contains (2022-04-19 10:46:33.173744+00, 2022-04-19 10:46:33.173765+00, agent==593c3a20fa0499ba55049942ea0a08067d89bb82, , , , , null, null).
[19/Apr/2022 10:46:33] "POST /api/v1/runs/ HTTP/1.1" 500 205797
From the /tmp/papag-components.yaml:
$ cat /tmp/papag-components.yaml
WARNING: version was passed into get_local_path() on a LocalRepo, which means it is being ignored. If this is actually a versioned repo, use GithubRepo or another versioned Repo type.
WARNING: version was passed into get_local_path() on a LocalRepo, which means it is being ignored. If this is actually a versioned repo, use GithubRepo or another versioned Repo type.
WARNING: version was passed into get_local_path() on a LocalRepo, which means it is being ignored. If this is actually a versioned repo, use GithubRepo or another versioned Repo type.
components:
  PAPAGRun==593c3a20fa0499ba55049942ea0a08067d89bb82:
    class_name: PAPAGRun
    dependencies: {}
    file_path: example_agents/papag/papag_run.py
    instantiate: false
    repo: papag_agent_dir
  agent==593c3a20fa0499ba55049942ea0a08067d89bb82:
    class_name: PAPAGAgent
    dependencies:
      PAPAGRun: PAPAGRun==593c3a20fa0499ba55049942ea0a08067d89bb82
    file_path: example_agents/papag/agent.py
    instantiate: true
    repo: papag_agent_dir
    requirements_path: example_agents/papag/requirements.txt
registries: []
repos:
  papag_agent_dir:
    type: github
    url: https://github.com/agentos-project/agentos.git
run_commands: {}
runs: {}
Interestingly, the WARNING: lines are valid YAML and don't seem to cause problems with the reg file. Finally,
$ cat output.txt
... # Lots of debugging output
Updates 620, num timesteps 24840, FPS 516
Last 10 training episodes: mean/median reward -19.8/-20.0, min/max reward -21.0/-16.0
Results for AgentRun 656fdba774124ee4a76bf05b89efdb54
Training results over 24 episodes:
Overall agent was trained on 21474 transitions over 24 episodes
Max reward over 24 episodes: -16.0
Mean reward over 24 episodes: -20.166666666666668
Median reward over 24 episodes: -20.0
Min reward over 24 episodes: -21.0
Run 2d69f36c740a439d852fe506c7babbe0 recorded. Execute the following for details:
agentos status 2d69f36c740a439d852fe506c7babbe0
It looked like you were able to get it to work on your machine. Any special sauce that I might be missing?
Maybe I messed up the branch merging or something? Or maybe documentation/demos/demo_ilya_papag_from_cli.sh isn't expected to run right out of the box yet? Let me know if I'm doing something obviously dumb! :P
🤔 I'll look more today and try on my Windows box too.
Just another observation as I try to reproduce per your instructions: on my M1, I'm using conda and conda install scipy. I doubt that's related, but documenting it just in case.
We debugged this together. The problem was my environment. A fresh install of the requirements (specifically pip install -e .) made it work for me!
I bet I installed the web requirements independently at some point (pip install -r web/requirements.txt) and got a fixed (pinned) version of PCS/AOS instead of the code in my checkout. There's probably a better way to install AOS for web...
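One quick way to check for this kind of mismatch is to ask Python where it would import a package from. The sketch below uses only the standard library; the helper names imported_from and is_under are made up for illustration, and "agentos" as the importable module name is an assumption based on the repo and CLI name.

```python
import importlib.util
from pathlib import Path


def imported_from(module_name):
    """Return the directory a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    # Built-in and frozen modules have no file location to report.
    if spec is None or not spec.has_location:
        return None
    return Path(spec.origin).resolve().parent


def is_under(module_name, checkout_dir):
    """True if the module resolves to a path inside checkout_dir."""
    location = imported_from(module_name)
    if location is None:
        return False
    checkout = Path(checkout_dir).resolve()
    return checkout in (location, *location.parents)


# Assumed module name; prints None if the package is not installed at all.
print(imported_from("agentos"))
```

If the printed path points into site-packages rather than the checkout, the environment is likely using a separately installed copy rather than the editable install.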
@nickjalbert I think we decided on the phone that this issue might have been fixed in 97cb6d446827fdd56ad7f138213f1ce9efe17151 -- merged as part of #350.
Can you confirm?
I'll check again tomorrow that all weirdness has resolved with the env fix, but for now I'll close because I'm pretty confident it'll be all good.
Working 💯 💯 locally!
Not sure if this is a known issue or not, but here's what I'm seeing:
The server blows up with the following:
The proximate cause is that when the Component doesn't exist, the get_or_create here doesn't have enough args to actually create a valid Component. Maybe I need to pass a recurse flag or something?
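To illustrate the failure mode, here is a minimal sketch that mimics get-or-create semantics against an in-memory SQLite table. It is not the AOS or Django code: the two-column component table and this get_or_create helper are simplified stand-ins, with only the NOT NULL instantiate column taken from the error above. The point is that a create driven purely by the lookup key hits the constraint, while supplying the missing fields (the role Django's defaults argument plays) succeeds.

```python
import sqlite3


def get_or_create(conn, identifier, defaults=None):
    """Return (row, created), loosely mirroring get-or-create semantics."""
    row = conn.execute(
        "SELECT identifier, instantiate FROM component WHERE identifier = ?",
        (identifier,),
    ).fetchone()
    if row is not None:
        return row, False
    # On a miss, insert the lookup key plus whatever defaults were supplied.
    fields = {"identifier": identifier, **(defaults or {})}
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    conn.execute(
        f"INSERT INTO component ({columns}) VALUES ({placeholders})",
        tuple(fields.values()),
    )
    return (identifier, fields.get("instantiate")), True


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema; only the NOT NULL column matters here.
conn.execute(
    "CREATE TABLE component ("
    "identifier TEXT PRIMARY KEY, instantiate INTEGER NOT NULL)"
)

try:
    # Lookup key only: the insert leaves instantiate NULL and is rejected.
    get_or_create(conn, "agent==abc123")
    lookup_only_failed = False
except sqlite3.IntegrityError as exc:
    lookup_only_failed = True
    print(f"create with lookup key only failed: {exc}")

# Supplying the required field lets the create succeed.
row, created = get_or_create(
    conn, "agent==abc123", defaults={"instantiate": True}
)
print(row, created)
```

Whether the right fix is to pass the full set of fields on the server side or to make sure the Component is registered before the Run is posted is a design question this sketch does not settle.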