HumanCompatibleAI / overcooked_ai

A benchmark environment for fully cooperative human-AI performance.
https://arxiv.org/abs/1910.05789
MIT License

Python native state visualization #53

Closed bmielnicki closed 3 years ago

bmielnicki commented 4 years ago

visualization2

More images are generated by the tool in the test-case generation notebook.

The high-level interface of StateVisualizer consists mostly of display_rendered_state and render_state.

StateVisualizer.display_rendered_state can output the image in several ways, and returns the path to the image if the image is displayed in an IPython cell or saved to a chosen path.

Example usage for a single visualization (where speed is not a problem), so a human can look at it: StateVisualizer().display_rendered_state(grid=grid, hud_data=hud_data, state=state, ipython_display=True)

visualization for deep RL from pixels:

visualizer = StateVisualizer(grid=grid)
pygame_surface = visualizer.render_state(state=state)
numpy_array_3d = pygame.surfarray.array3d(pygame_surface)

Below is the config (with default values) that can be used when creating an instance or to overwrite the defaults (defaults for the whole class are configurable via StateVisualizer.configure_defaults(**configure_defaults_config)).

{
        "height": None, # if None use grid_width - NOTE: can chop down hud if hud is wider than grid
        "width": None, # if None use (hud_height+grid_height)
        "tile_size": 75,
        "window_fps": 30,
        "player_colors": ['blue', 'green'],
        "is_rendering_hud": True,
        "hud_font_size": 10,
        "hud_font_path": roboto_path, 
        "hud_system_font_name": None, # if set to None use hud_font_path
        "hud_font_color": (255,255,255), # white
        "hud_data_default_key_order" : ["all_orders", "bonus_orders", "time_left", "score", "potential"],
        "hud_interline_size": 10,
        "hud_margin_bottom": 10,
        "hud_margin_top": 10,
        "hud_margin_left": 10,
        "hud_distance_between_orders": 5,
        "hud_order_size": 15,
        "is_rendering_cooking_timer": True,
        "show_timer_when_cooked": True,
        "cooking_timer_font_size": 20, # # if set to None use cooking_timer_font_path
        "cooking_timer_font_path": roboto_path, 
        "cooking_timer_system_font_name": None,
        "cooking_timer_font_color": (255, 0, 0), # red
        "grid": None,
        "background_color": (155, 101, 0) # color of empty counter
    }
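A minimal sketch of how class-wide defaults combined with per-instance overrides can work, assuming a dict-based config (the configure_defaults and DEFAULT_VALUES names come from this thread; everything else here is illustrative, not the actual implementation):

```python
import copy

class StateVisualizer:
    # class-wide defaults; a shortened, illustrative subset of the full config
    DEFAULT_VALUES = {
        "tile_size": 75,
        "window_fps": 30,
        "player_colors": ["blue", "green"],
        "is_rendering_hud": True,
    }

    @classmethod
    def configure_defaults(cls, **kwargs):
        # overwrite class-wide defaults, rejecting unknown keys
        unknown = set(kwargs) - set(cls.DEFAULT_VALUES)
        if unknown:
            raise ValueError("unknown config keys: %s" % unknown)
        cls.DEFAULT_VALUES.update(kwargs)

    def __init__(self, **kwargs):
        # per-instance config: a deep copy of the defaults plus overrides
        config = copy.deepcopy(self.DEFAULT_VALUES)
        config.update(kwargs)
        self.config = config

StateVisualizer.configure_defaults(tile_size=45)
v = StateVisualizer(is_rendering_hud=False)
print(v.config["tile_size"], v.config["is_rendering_hud"])  # 45 False
```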

What can be improved in this code:

  1. Text wrapping (for now this can be worked around by setting the width parameter so the text won't be chopped off)
  2. Better optimization, e.g. hud and terrain caching, or replacing pygame.init() with initialization of only the modules that are used (and checking whether calling init when pygame is already initialized takes enough time to be worth caring about):
    • while the optimization could be better, it is currently probably good enough, especially without file saves:
      • with hud display (no file saving):
        • 100 visualizations (without saving and displaying a file) with initialized object before - 0.11s
        • 100 visualizations (without saving and displaying a file) with initialization before every render - 0.27s
      • with hud display and file save:
        • 100 visualizations (without displaying a file) with initialized object before - 1s
        • 100 visualizations (without displaying a file) with initialization before every render - 1.15s
      • without hud display (no file saving):
        • 100 visualizations (without saving and displaying a file) with initialized object before - 0.05s
        • 100 visualizations (without saving and displaying a file) with initialization before every render - 0.08s
      • without hud display, export to 3d np.array:
        • 100 visualizations (without saving and displaying a file) with initialized object before - 0.17s
        • 100 visualizations (without saving and displaying a file) with initialization before every render - 0.2s
    • summary:
      • saving the file takes 80-90% of the time (+1s per 100 visualizations)
      • export to np.array takes 70% of the time (+0.12s per 100 visualizations)
      • hud_display takes +0.06s per 100 visualizations
      • initialization takes 0.03s per 100 visualizations, but that may be because pygame caches initialized fonts or something similar
        • in any case, it is better to initialize the object once before visualizing a state hundreds of times in a single trajectory for ML training
  3. Write a wrapper for creating a surface that fills it with the background color right after creation (probably removes 4 lines of code and adds 3 new ones).
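The timing comparison in point 2 can be reproduced with a small harness along these lines (make_visualizer and render are stand-ins for the real construction and render calls; this is a sketch, not the benchmark actually used):

```python
import time

def make_visualizer():
    # stand-in for StateVisualizer(...) construction
    return {"initialized": True}

def render(visualizer):
    # stand-in for an actual render call; placeholder work
    return sum(i * i for i in range(1000))

def time_renders(n, init_each_time):
    """Time n renders, either reusing one visualizer or rebuilding it each time."""
    visualizer = None if init_each_time else make_visualizer()
    start = time.perf_counter()
    for _ in range(n):
        current = make_visualizer() if init_each_time else visualizer
        render(current)
    return time.perf_counter() - start

print("init once:        %.4fs" % time_renders(100, False))
print("init each render: %.4fs" % time_renders(100, True))
```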

I'm also not sure if my approach in __init__ is right. It works correctly and makes it easy to add new variables. On the other hand, my linter complains with things like "Instance of 'StateVisualizer' has no 'is_rendering_hud' member pylint(no-member)". The alternative is declaring 25 instance variables by hand (probably with None values - they would be updated anyway). Also, using kwargs makes IDE support for the init parameters non-existent, and eliminating kwargs takes even more repetitive lines of code than eliminating the lint errors.
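For reference, the two __init__ approaches being weighed can be sketched like this (hypothetical class and attribute names; the dynamic variant is what triggers pylint's no-member warning, since the attributes never appear literally in the class body):

```python
DEFAULTS = {"tile_size": 75, "is_rendering_hud": True}

class DynamicInit:
    # easy to extend with new config keys, but linters/IDEs
    # cannot see the attributes, hence pylint(no-member)
    def __init__(self, **kwargs):
        config = dict(DEFAULTS)
        config.update(kwargs)
        for key, value in config.items():
            setattr(self, key, value)

class ExplicitInit:
    # repetitive (one line per variable), but every attribute
    # is visible to linters and IDE autocompletion
    def __init__(self, tile_size=75, is_rendering_hud=True):
        self.tile_size = tile_size
        self.is_rendering_hud = is_rendering_hud

a = DynamicInit(tile_size=45)
b = ExplicitInit()
print(a.tile_size, a.is_rendering_hud, b.tile_size)  # 45 True 75
```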

I've also removed the visualization test from the OSX tests - the test fails in one place, probably because fonts render slightly differently than on Linux (it was only a 4-pixel difference when displaying hud text, so nothing serious).

codecov[bot] commented 4 years ago

Codecov Report

Merging #53 (14459a5) into master (242c24d) will decrease coverage by 13.48%. The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #53       +/-   ##
===========================================
- Coverage   82.70%   69.21%   -13.49%     
===========================================
  Files          10       13        +3     
  Lines        3035     3651      +616     
===========================================
+ Hits         2510     2527       +17     
- Misses        525     1124      +599     
Flag Coverage Δ
no-planners 69.21% <ø> (-13.49%) :arrow_down:


Impacted Files Coverage Δ
overcooked_ai_py/data/planners/__init__.py 57.14% <0.00%> (-42.86%) :arrow_down:
overcooked_ai_py/planning/planners.py 46.66% <0.00%> (-39.18%) :arrow_down:
overcooked_ai_py/agents/benchmarking.py 50.44% <0.00%> (-15.85%) :arrow_down:
overcooked_ai_py/agents/agent.py 56.90% <0.00%> (-15.41%) :arrow_down:
overcooked_ai_py/mdp/overcooked_mdp.py 87.68% <0.00%> (-5.14%) :arrow_down:
overcooked_ai_py/mdp/overcooked_env.py 65.19% <0.00%> (-4.10%) :arrow_down:
overcooked_ai_py/static.py 100.00% <0.00%> (ø)
overcooked_ai_py/planning/search.py 52.17% <0.00%> (ø)
...rcooked_ai_py/visualization/visualization_utils.py 66.66% <0.00%> (ø)
overcooked_ai_py/visualization/pygame_utils.py 69.01% <0.00%> (ø)
... and 3 more


Last update 242c24d...6f67fa0.

bmielnicki commented 4 years ago

Updated several things:

bmielnicki commented 4 years ago

@nathan-miller23 Can I click the "Merge pull request" button or do you want to look at the code before?

micahcarroll commented 4 years ago

@mesutyang97 if possible maybe you could give a look at this PR too before we merge it?

micahcarroll commented 4 years ago

Also Bartek, do you have an intuition why the code coverage is reduced by this PR (I'm still getting used to how those numbers are calculated)?

bmielnicki commented 4 years ago

Also Bartek, do you have an intuition why the code coverage is reduced by this PR (I'm still getting used to how those numbers are calculated)?

I have no idea. Most of the lowered coverage comes from the planners file, which this PR doesn't change at all. For more info see this

bmielnicki commented 4 years ago

Added trajectory visualization (https://github.com/HumanCompatibleAI/overcooked_ai/issues/63) and static methods for the default creation of hud data. There are also other minor improvements to the existing code.

trajectory_visualization

mesutyang97 commented 4 years ago

Hey @bmielnicki, thanks again for the hard work! I was testing the visualization code you have on randomly generated layouts, and noticed a problem with the agents' locations being out of sync with where they actually are in the game. I am attaching two subsequent frames from what I ran in the visualization:

Screen Shot 2020-09-28 at 4 58 45 PM Screen Shot 2020-09-28 at 4 58 52 PM

It looks like a rendering issue, because the agent was able to pick up the plate facing the bottom dish dispenser, so we know the agent isn't supposed to be where it is rendered.

Here is the code I ran for you to reproduce the issue:

# NOTE: DEFAULT_VALUES, AgentEvaluator, and display_and_export_to_array
# come from the test-case generation notebook mentioned earlier in this thread
import copy

from overcooked_ai_py.mdp.layout_generator import LayoutGenerator

# testing some states hoping it can find unexpected bugs
various_tests = []
various_results = []

config = copy.deepcopy(DEFAULT_VALUES)
config["tile_size"] = 45
config["cooking_timer_font_size"] = 15

DEFAULT_MDP_GEN_PARAMS = {
    "inner_shape": (7, 5),
    "prop_empty": 0.7,
    "prop_feats": 0.4,
    "start_all_orders" : [
        { "ingredients" : ["onion", "onion", "onion"]}
    ],
    "recipe_values" : [20],
    "recipe_times" : [20],
    "display": False
}

mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(DEFAULT_MDP_GEN_PARAMS, outer_shape=(7, 5))

agent_eval = AgentEvaluator({"horizon": 1001}, mdp_fn)
grid = agent_eval.env.mdp.terrain_mtx
print(grid)
trajectory_random_pair = agent_eval.evaluate_random_pair(num_games=1, display=False)
for i in range(0, 30):
    state = trajectory_random_pair["ep_states"][0][i]
    kwargs = {"hud_data": {}, "grid":grid, "state":state.to_dict()}
    test_dict = {"config": config, "kwargs": kwargs, 
     "comment": "Various tests",
     "result_array_filename": "test_various_display_%d.npy"%i}
    print("test_various_display_%i dict"%i)
    test_array = display_and_export_to_array(test_dict)
    various_tests.append(test_dict)
    various_results.append(test_array)

Let me know if you would like me to help identify the cause. Thanks again for your help!

bmielnicki commented 4 years ago

The solution for the issue reported by @mesutyang97 is now pushed along with the additional test of visualizations of generated layouts.

bmielnicki commented 4 years ago

Added feature requested by @mesutyang97

  1. Added visualization of action probs. Action probs need to be supplied along with the state, and they can come from the agent used in the trajectory or from any other agent. It's not pretty (especially when visualizing the probs of 2 agents), but it does its job. If you have an idea for how to improve the visuals, let me know.

  2. Added a SampleAgent class (let me know if you have an idea for a better name) that holds multiple agents inside and samples actions using the average of the action probs from the contained agents. This is equivalent to sampling an agent and then sampling the action from that agent. This agent can be used for counterfactual prediction of the action in a given state.
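The equivalence mentioned in point 2 can be illustrated with a small sketch (a simplified stand-in, not the real SampleAgent): uniformly picking one of the contained agents and then sampling its action induces the same distribution as sampling directly from the averaged action probs.

```python
import random

def average_probs(prob_lists):
    # mean of the per-agent action distributions (a uniform mixture)
    n = len(prob_lists)
    return [sum(p[i] for p in prob_lists) / n for i in range(len(prob_lists[0]))]

def sample_action(actions, probs, rng):
    # draw one action according to the given distribution
    return rng.choices(actions, weights=probs, k=1)[0]

actions = ["up", "down", "stay"]
agent_probs = [
    [0.8, 0.1, 0.1],      # e.g. a fairly deterministic agent
    [1/3, 1/3, 1/3],      # e.g. a uniform random agent
]
avg = average_probs(agent_probs)
print([round(p, 3) for p in avg])  # [0.567, 0.217, 0.217]
```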

Example use case of the code mentioned above:

# NOTE: the import paths below are assumed; SampleAgent is the class added in this PR
from overcooked_ai_py.mdp.layout_generator import LayoutGenerator
from overcooked_ai_py.agents.benchmarking import AgentEvaluator
from overcooked_ai_py.agents.agent import GreedyHumanModel, RandomAgent, SampleAgent
from overcooked_ai_py.visualization.state_visualizer import StateVisualizer

mdp_gen_params = {"layout_name": 'cramped_room'}
mdp_fn = LayoutGenerator.mdp_gen_fn_from_dict(mdp_gen_params)
env_params = {"horizon": 10}
agent_eval = AgentEvaluator(env_params=env_params, mdp_fn=mdp_fn)

trajectory = agent_eval.evaluate_human_model_pair(num_games=1)
state = trajectory["ep_states"][0][5]
grid = trajectory["mdp_params"][0]["terrain"]
greedy = GreedyHumanModel(mlam=agent_eval.env.mlam)
greedy.set_agent_index(0)
a1 = greedy.action(state)[1]["action_probs"]
greedy.set_agent_index(1)
a2 = greedy.action(state)[1]["action_probs"]
action_probs = [a1, a2]
agent1 = SampleAgent([GreedyHumanModel(mlam=agent_eval.env.mlam),
                      RandomAgent(all_actions=True),
                      RandomAgent(all_actions=False)])
agent1.agents[0].set_agent_index(0)
agent2 = SampleAgent([GreedyHumanModel(mlam=agent_eval.env.mlam),
                      RandomAgent(all_actions=True),
                      RandomAgent(all_actions=False)])
agent2.agents[0].set_agent_index(1)

action_probs = [agent1.action(state)[1]["action_probs"], None]

StateVisualizer(grid=grid).display_rendered_state(state, ipython_display=True, action_probs=action_probs)
action_probs = [agent1.action(state)[1]["action_probs"], agent2.action(state)[1]["action_probs"]]
StateVisualizer(grid=grid).display_rendered_state(state, ipython_display=True, action_probs=action_probs)

action_prob_visualizations

action_probs_traj = []
for state in trajectory["ep_states"][0]:
    action_probs_traj.append([agent1.action(state)[1]["action_probs"], agent2.action(state)[1]["action_probs"]])
StateVisualizer(grid=grid).display_rendered_trajectory(trajectory, ipython_display=True, action_probs=action_probs_traj)

action_probs_trajectory

mesutyang97 commented 4 years ago

Looks great! Thanks for all the hard work on the visualization!!! Let's see what @nathan-miller23 says, but I think this should be merged asap.

micahcarroll commented 3 years ago

Was going to merge, but after pulling from master, tests started failing. Not entirely sure why, but it seems like the errors involve color values being off by one (if I'm interpreting the logs correctly)?

Tried re-generating the visualization tests data from the ipynb but that also didn't seem to help. If we fix the tests, I'll merge after that!

bmielnicki commented 3 years ago

Was going to merge, but after pulling from master, tests started failing. Not entirely sure why, but it seems like the errors involve color values being off by one (if I'm interpreting the logs correctly)?

Tried re-generating the visualization tests data from the ipynb but that also didn't seem to help. If we fix the tests, I'll merge after that!

There are slight differences in rendering between operating systems (between different Linux distros, and between OSX and Linux). The only solution I've found is to turn off the asserts (leaving only printouts for manual checks).
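One alternative to dropping the asserts entirely would be a tolerance-based comparison that permits small cross-platform rendering differences; a minimal sketch (the threshold values here are assumptions, not tuned numbers):

```python
def images_close(img_a, img_b, max_channel_diff=8, max_diff_fraction=0.01):
    """Compare two images given as nested lists of (r, g, b) tuples.

    Returns True if no single channel differs by more than max_channel_diff
    and at most max_diff_fraction of the pixels differ at all.
    """
    total = 0
    differing = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            deltas = [abs(a - b) for a, b in zip(px_a, px_b)]
            if max(deltas) > max_channel_diff:
                return False  # a pixel changed too drastically
            if any(deltas):
                differing += 1
    return differing / total <= max_diff_fraction

identical = [[(10, 10, 10)] * 4 for _ in range(4)]
off_by_one = [[(11, 10, 10)] + [(10, 10, 10)] * 3] + [[(10, 10, 10)] * 4 for _ in range(3)]
print(images_close(identical, identical))   # True
print(images_close(identical, off_by_one))  # False (1/16 of pixels differ, above the 1% cap)
```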

micahcarroll commented 3 years ago

Ok, thanks so much Bartek! I'll go ahead and merge!