PathmindAI / pathmind-api

Policy Serving Error #10

Closed ejunprung closed 2 years ago

ejunprung commented 2 years ago

I'm seeing the error below when testing the policy server. My guess is that the policy server outputs an integer, whereas simulation.py expects a 2D array or something like that.

simulation = MouseAndCheese()
policy = Server(url="https://api.pathmind.com/policy/id17530", api_key="3f05f711-c792-4c5b-b8b5-76d41f717d21")
simulation.run(policy)
>>> simulation.run(policy)
+---------+------+----------------+-----------+-----------+--------+
| Episode | Step | observations_0 | actions_0 | rewards_0 | done_0 |
+---------+------+----------------+-----------+-----------+--------+
+---------+------+----------------+-----------+-----------+--------+
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ejunprung/Documents/GitHub/pathmind-api/pathmind/simulation.py", line 146, in run

  File "/Users/ejunprung/Documents/GitHub/pathmind-api/tests/examples/mouse/mouse_env_pathmind.py", line 37, in step
    else:
ValueError: Invalid action

slinlee commented 2 years ago

Hmm, I think I saw something like this earlier and fixed it. Let me give it a try.

slinlee commented 2 years ago

K. I debugged this. Since we don't extract the observations, the policy server only exposes /predict_raw, but the client hardcodes '/predict'.

For now I'll adjust it so that the user enters the whole URL, including /predict.

Alex just added a PR to the webapp that will extract the observations though, so they'll be available for the policy server.
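The endpoint mix-up above could also be guarded against on the client side. Here is a minimal sketch, assuming a hypothetical helper `resolve_endpoint` (not part of pathmind-api) and assuming that `/predict` and `/predict_raw` are the only endpoint names that matter, since those are the two mentioned in this thread:

```python
from urllib.parse import urlsplit, urlunsplit

# Endpoint names taken from this thread; anything else is treated as a base URL.
KNOWN_ENDPOINTS = ("predict", "predict_raw")


def resolve_endpoint(url: str, default_endpoint: str = "predict") -> str:
    """Return the URL unchanged if it already ends in a known endpoint,
    otherwise append the default endpoint to the path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment not in KNOWN_ENDPOINTS:
        path = f"{path}/{default_endpoint}"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(resolve_endpoint("https://api.pathmind.com/policy/id17530"))
    print(resolve_endpoint("https://api.pathmind.com/policy/id17530/predict_raw"))
```

With something like this, a user could pass either the base policy URL or the full endpoint URL, and the client would never append `/predict` twice.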

slinlee commented 2 years ago

nvm, I can't just use /predict_raw because the observations are currently passed as a JSON dict. Switching them to an array would be a counter-productive hack.

I'll test Alex's PR on dev first.
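To illustrate why flattening the dict into an array would be fragile, here is a small sketch. The exact request body that /predict_raw expects is not shown in this thread, so the flat-list shape below is an assumption, and `named_payload` and `raw_payload` are hypothetical helpers. The observation values are the ones the simulation reports at step 1 further down.

```python
import json

# One observation as reported by the simulation later in this thread (step 1).
observation = {
    "mouse_row": 0.2,
    "mouse_col": 0.0,
    "distance_to_cheese_row": 0.6,
    "distance_to_cheese_col": 0.8,
    "cheese_row": 0.8,
    "cheese_col": 0.8,
}


def named_payload(obs):
    """Serialize observations by name; the key order carries no meaning."""
    return json.dumps(obs, sort_keys=True)


def raw_payload(obs, feature_order):
    """Flatten observations into a bare list (assumed shape).
    The result is only meaningful if feature_order matches the server's order."""
    return json.dumps([obs[name] for name in feature_order])


if __name__ == "__main__":
    print(named_payload(observation))
    print(raw_payload(observation, list(observation)))    # insertion order
    print(raw_payload(observation, sorted(observation)))  # alphabetical order
```

The named payload stays the same no matter how the dict was built, while the flat list silently changes meaning whenever the feature order differs between client and server.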

slinlee commented 2 years ago

Testing the PR on dev, you'll see that there are named observations now. Once I have a policy server live, I'll test @ejunprung's original issue.

[Screenshot: Screen Shot 2021-10-20 at 7 04 59 PM]

maxpumperla commented 2 years ago

@slinlee @ejunprung we can also test this locally, at least to fix this bug. Note that, in any case, this can be "solved" by changing how the simulation in question reads out actions. But our goal should be to be consistent between Local and Server policies.

Btw, it'd be super helpful to have a test policy server for mouse and cheese up and running at all times, so that we can integrate it into unit tests, and more importantly to demo the API. That'd be quite cool for the README as well. It'd be essentially those 3 lines plus imports to showcase how running simulations works!

maxpumperla commented 2 years ago

@ejunprung I've tested the same thing like so:

In [8]: action = policy.get_actions(simulation)
In [9]: action[0]

Out[9]: array(None, dtype=object)

which means that we get a None value response from the Server. That's indeed not a valid action.

The good news is that when I run a Local policy against the same simulation, I get the same response type (an array per agent), this time holding a valid action:

In [13]: local_policy = Local(model_file="./examples/mouse_model")
In [14]: local_policy.get_actions(simulation)
Out[14]: {0: array([1])}

So, ultimately this is a non-issue. We "just" need to fix the corresponding policy server.

maxpumperla commented 2 years ago

ok, I think I found it:

Out[38]: b'{"detail":[{"loc":["body","mouse_row_distance"],"msg":"field required","type":"value_error.missing"},{"loc":["body","mouse_col_distance"],"msg":"field required","type":"value_error.missing"}]}'

We have a discrepancy between the feature names in the policy server's schema.yaml and the ones in the local simulation: the schema has, e.g., mouse_row_distance, whereas the local simulation here calls it distance_to_cheese_row.
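A validation response like the one above can be turned into a readable message instead of ending up as a `None` action. This is a minimal sketch using only the standard library. `missing_fields` and `describe_mismatch` are hypothetical helpers, not part of pathmind-api, and the error body is copied from the output above.

```python
import json
from typing import Iterable, List

# Response body copied verbatim from the policy server output above.
ERROR_BODY = (
    b'{"detail":[{"loc":["body","mouse_row_distance"],"msg":"field required",'
    b'"type":"value_error.missing"},{"loc":["body","mouse_col_distance"],'
    b'"msg":"field required","type":"value_error.missing"}]}'
)


def missing_fields(response_body: bytes) -> List[str]:
    """Return the names of fields the server reported as missing."""
    detail = json.loads(response_body).get("detail", [])
    if not isinstance(detail, list):
        return []  # e.g. a plain string message rather than a list of errors
    return [
        str(item["loc"][-1])
        for item in detail
        if isinstance(item, dict)
        and item.get("type") == "value_error.missing"
        and item.get("loc")
    ]


def describe_mismatch(response_body: bytes, sent_names: Iterable[str]) -> str:
    """Build a message contrasting required fields with the names actually sent."""
    missing = missing_fields(response_body)
    if not missing:
        return "no missing fields reported"
    return "server requires {} but the simulation sent {}".format(
        missing, sorted(sent_names)
    )


if __name__ == "__main__":
    sent = ["distance_to_cheese_row", "distance_to_cheese_col"]
    print(describe_mismatch(ERROR_BODY, sent))
```

Raising an error with this message on the client would point straight at the naming mismatch, rather than failing later in the simulation's `step` with "Invalid action".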

maxpumperla commented 2 years ago

Closing this issue now, as it's a fluke. :D But I've opened #11 to make it easier to spot this kind of issue... after all we have all this fancy validation in policy server, so we might as well use it.

slinlee commented 2 years ago

> Btw, it'd be super helpful to have a test policy server for mouse and cheese up and running at all times, so that we can integrate it into unit tests, and more importantly to demo the API. That'd be quite cool for the README as well. It'd be essentially those 3 lines plus imports to showcase how running simulations works!

@maxpumperla - https://github.com/SkymindIO/nativerl/blob/dev/nativerl/python/tests/test_policy_serving.py#L11-L18 This policy server was trained on the AnyLogic model, which is why its observations differ from the ones in this mouse_env_pathmind:

https://github.com/SkymindIO/nativerl/blob/dev/nativerl/python/tests/mouse/mouse_env_pathmind.py#L39-L47

I'll get a mouse_and_cheese policy server up and running that is trained from a Pathmind model, but right now those models fail in training because training expects get_metrics()

maxpumperla commented 2 years ago

btw, nothing is stopping us from providing get_metrics. Just bc we've removed it from the interface, doesn't mean your sim can't still have it.

slinlee commented 2 years ago

It works now that we have (1) observations integrated in the webapp and (2) the updated environment.py in the webapp

+---------+------+----------------+-----------+-----------+--------+
| Episode | Step | observations_0 | actions_0 | rewards_0 | done_0 |
+---------+------+----------------+-----------+-----------+--------+
+---------+------+----------------+-----------+-----------+--------+
[0, 0, {'mouse_row': 0.0, 'mouse_col': 0.0, 'distance_to_cheese_row': 0.8, 'distance_to_cheese_col': 0.8, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([0]), {'found_cheese': 0}, False]
[0, 1, {'mouse_row': 0.2, 'mouse_col': 0.0, 'distance_to_cheese_row': 0.6, 'distance_to_cheese_col': 0.8, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([1]), {'found_cheese': 0}, False]
[0, 2, {'mouse_row': 0.2, 'mouse_col': 0.2, 'distance_to_cheese_row': 0.6, 'distance_to_cheese_col': 0.6, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([0]), {'found_cheese': 0}, False]
[0, 3, {'mouse_row': 0.4, 'mouse_col': 0.2, 'distance_to_cheese_row': 0.4, 'distance_to_cheese_col': 0.6, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([1]), {'found_cheese': 0}, False]
[0, 4, {'mouse_row': 0.4, 'mouse_col': 0.4, 'distance_to_cheese_row': 0.4, 'distance_to_cheese_col': 0.4, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([0]), {'found_cheese': 0}, False]
[0, 5, {'mouse_row': 0.6, 'mouse_col': 0.4, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.4, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([1]), {'found_cheese': 0}, False]
[0, 6, {'mouse_row': 0.6, 'mouse_col': 0.6, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.2, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([1]), {'found_cheese': 0}, False]
[0, 7, {'mouse_row': 0.6, 'mouse_col': 0.8, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.0, 'cheese_row': 0.8, 'cheese_col': 0.8}, array([0]), {'found_cheese': 1}, True]
+---------+------+------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------------------+--------+
| Episode | Step |                                                              observations_0                                                              | actions_0 |      rewards_0      | done_0 |
+---------+------+------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------------------+--------+
|    0    |  0   | {'mouse_row': 0.0, 'mouse_col': 0.0, 'distance_to_cheese_row': 0.8, 'distance_to_cheese_col': 0.8, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [0]    | {'found_cheese': 0} | False  |
|    0    |  1   | {'mouse_row': 0.2, 'mouse_col': 0.0, 'distance_to_cheese_row': 0.6, 'distance_to_cheese_col': 0.8, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [1]    | {'found_cheese': 0} | False  |
|    0    |  2   | {'mouse_row': 0.2, 'mouse_col': 0.2, 'distance_to_cheese_row': 0.6, 'distance_to_cheese_col': 0.6, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [0]    | {'found_cheese': 0} | False  |
|    0    |  3   | {'mouse_row': 0.4, 'mouse_col': 0.2, 'distance_to_cheese_row': 0.4, 'distance_to_cheese_col': 0.6, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [1]    | {'found_cheese': 0} | False  |
|    0    |  4   | {'mouse_row': 0.4, 'mouse_col': 0.4, 'distance_to_cheese_row': 0.4, 'distance_to_cheese_col': 0.4, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [0]    | {'found_cheese': 0} | False  |
|    0    |  5   | {'mouse_row': 0.6, 'mouse_col': 0.4, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.4, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [1]    | {'found_cheese': 0} | False  |
|    0    |  6   | {'mouse_row': 0.6, 'mouse_col': 0.6, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.2, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [1]    | {'found_cheese': 0} | False  |
|    0    |  7   | {'mouse_row': 0.6, 'mouse_col': 0.8, 'distance_to_cheese_row': 0.2, 'distance_to_cheese_col': 0.0, 'cheese_row': 0.8, 'cheese_col': 0.8} |    [0]    | {'found_cheese': 1} |  True  |
+---------+------+------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------------------+--------+

Process finished with exit code 0