bmielnicki closed this pull request 2 years ago
Merging #57 (27d16be) into master (b0d6997) will increase coverage by 3.12%. The diff coverage is n/a.
@@ Coverage Diff @@
## master #57 +/- ##
==========================================
+ Coverage 80.76% 83.88% +3.12%
==========================================
Files 10 10
Lines 3077 3313 +236
==========================================
+ Hits 2485 2779 +294
+ Misses 592 534 -58
Flag | Coverage Δ | |
---|---|---|
no-planners | 83.88% <ø> (+3.12%) | :arrow_up: |
Impacted Files | Coverage Δ | |
---|---|---|
overcooked_ai_py/mdp/layout_generator.py | 82.40% <0.00%> (+0.05%) | :arrow_up: |
overcooked_ai_py/mdp/overcooked_env.py | 69.62% <0.00%> (+0.57%) | :arrow_up: |
overcooked_ai_py/planning/planners.py | 86.19% <0.00%> (+0.70%) | :arrow_up: |
overcooked_ai_py/mdp/overcooked_mdp.py | 93.58% <0.00%> (+1.58%) | :arrow_up: |
overcooked_ai_py/agents/agent.py | 72.31% <0.00%> (+4.11%) | :arrow_up: |
overcooked_ai_py/agents/benchmarking.py | 65.19% <0.00%> (+11.68%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b0d6997...6b754fb.
Added a pull request for overcooked-demo: https://github.com/HumanCompatibleAI/overcooked-demo/pull/23. Merge this PR only together with the changes in overcooked-demo (the overcooked-demo PR can be merged first, as it is compatible with current master).
New changes:
Given that this is a lower-priority issue right now, I'll temporarily close this PR for bookkeeping. We can re-open it in the future if we are interested in exploring this direction further.
List of changes:
- Orders are now a separate class from recipes, so they can carry information that does not make sense for recipes (such as expire time, expire penalty, and more complex reward calculation). These changes allow having orders that:
- Orders are part of an OrdersList ("list" in the name refers to an actual list of orders, not to the Python class, although it is currently implemented using a Python list).
- A time_to_expire that is not None indicates that the order is temporary and will disappear after some time, giving a negative reward (expire_penalty).
- Calculating possible rewards and adjectives to potting in get_recipe_value has changed a bit, and it is now difficult to make it perfect. Currently, the usage of adjectives does not account for the possibility of new recipes appearing.
- The Overcooked env/gridworld now has "sparse_env_rewards" (rewards coming from the env; for now only punishments for expired recipes fall into this category) and "sparse_rewards_sum" (the sum of sparse_reward_by_agent and sparse_env_rewards), added in the same places where "sparse_reward_by_agent" appears now.
- get_recipe_value and get_optimal_possible_recipe have some changes so that they work with orders that can expire.
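To make the expiry and reward bookkeeping described above concrete, here is a minimal sketch. It is not the code from this PR: `SketchOrder`, `SketchOrdersList`, the `step` method, and the free function `sparse_rewards_sum` are hypothetical names chosen for illustration, and the sketch assumes that `expire_penalty` is stored as a positive magnitude and that `sparse_reward_by_agent` is a per-agent list. Only the field names `time_to_expire` and `expire_penalty` and the reward names come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchOrder:
    """Hypothetical stand-in for an order; not the PR's actual Order class."""
    recipe: str
    time_to_expire: Optional[int] = None  # None means the order never expires
    expire_penalty: float = 0.0           # assumed positive; applied as a negative reward

    @property
    def is_temporary(self) -> bool:
        return self.time_to_expire is not None


@dataclass
class SketchOrdersList:
    """Hypothetical container of orders, backed by a plain Python list."""
    orders: List[SketchOrder] = field(default_factory=list)

    def step(self) -> float:
        """Advance one timestep, drop expired orders, return the env reward (<= 0)."""
        env_reward = 0.0
        remaining = []
        for order in self.orders:
            if order.is_temporary:
                order.time_to_expire -= 1
                if order.time_to_expire <= 0:
                    env_reward -= order.expire_penalty
                    continue  # expired orders are removed from the list
            remaining.append(order)
        self.orders = remaining
        return env_reward


def sparse_rewards_sum(sparse_reward_by_agent, sparse_env_rewards):
    """Sum of the per-agent sparse rewards and the env-level sparse reward."""
    return sum(sparse_reward_by_agent) + sparse_env_rewards


orders = SketchOrdersList([
    SketchOrder("onion_soup"),                                        # permanent order
    SketchOrder("tomato_soup", time_to_expire=2, expire_penalty=5.0), # temporary order
])
first = orders.step()    # nothing expires yet
second = orders.step()   # the temporary order expires here
total = sparse_rewards_sum([20, 0], second)
```

In this sketch the expiry penalty is reported separately from the per-agent rewards, mirroring the split between "sparse_env_rewards" and "sparse_reward_by_agent"; how the actual PR attributes the penalty may differ.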
As the next step, I will change overcooked-demo (and then the Python state visualizations, as they are not merged yet) to visually represent temporary orders (e.g. add info about the time remaining before an order expires). It's better to wait with merging this PR into master until the change in overcooked-demo has been made and reviewed.