yasserfarouk / scml

ANAC Supply Chain Management League Development Environment

Utility Function Calculation #45

Closed dipplestix closed 3 years ago

dipplestix commented 3 years ago

Describe the bug: The utility function calculation doesn't match the description in the paper.

To Reproduce: My agent's state is:

OneShotState(exogenous_input_quantity=6, exogenous_input_price=66, exogenous_output_quantity=0, exogenous_output_price=0, disposal_cost=0.010749528219285567, shortfall_penalty=0.47583515219370137, current_balance=2625.8255824262014)
agent.awi.trading_prices = array([10., 17., 29.])
agent.awi.catalog_prices = array([10, 14, 29])

agent.ufun([6, 0, 15]) = 18
agent.ufun([7, 0, 15]) = 13.368697784675417

Issues:

1) The agent.ufun.output_penalty_scale is set to the trading price of the input instead of the output: agent.ufun.output_penalty_scale = agent.ufun.input_penalty_scale = 10

2) If I take an agent in the second level, I get agent.ufun.output_penalty_scale = agent.ufun.input_penalty_scale = 14, which is the catalog price instead of the trading price.

3) agent.ufun([6, 0, 15]) - agent.ufun.shortfall_penalty * agent.ufun.output_penalty_scale - agent.ufun([7, 0, 15]) = 0. I think this should be equal to the production cost instead of 0, since we're charged $m_a Q^{out}$ not $m_a Q^{out}$ according to equation 5.

yasserfarouk commented 3 years ago

Thank you for reporting this issue.

> the agent.ufun.output_penalty_scale is set to the trading price of the input instead of the output

Reproduced and resolved. It was a typo 😡. Its effect is almost equivalent to having a lower shortfall_penalty. You can download the new version with the bug fix from GitHub (1b9040a6fc7d4e35e7bec92d9d9cd9bf10f57509).

pip install -U git+https://www.github.com/yasserfarouk/scml

This is a hot push and I did not run all the tests, so you may also need to update negmas the same way if you face any problems.

> If I take an agent in the second level, I get agent.ufun.output_penalty_scale = agent.ufun.input_penalty_scale = 14, which is the catalog price instead of the trading price.

The ufun uses the trading price at the beginning of the step, not at the end of it. We confirmed that this is the case. When we speak about tp(s) in the document, we usually mean the trading price at the end of simulation step s, so I will modify the description to use tp(s-1) instead of tp(s) in the ufun calculation. The rationale is that the agent already knows tp(s-1) but not tp(s) while trading in step s.

You can check that the older trading price, not the catalog price, is used by running the test test_ufun_penalty_scales_are_correct in tests/test_scml2021oneshot.py. Note that the trading price is initialized to the catalog price in the first step.
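
The rule described here can be sketched in a few lines of Python. This is only an illustration, not the library's code: `penalty_scale_for_step`, its arguments, and the price values are hypothetical, and the sketch assumes `trading_price_history[s]` holds the trading price at the end of step `s`, i.e. tp(s).

```python
from typing import Sequence


def penalty_scale_for_step(
    step: int,
    catalog_price: float,
    trading_price_history: Sequence[float],
) -> float:
    """Return the price used to scale a penalty while trading in `step`.

    Hypothetical helper: trading_price_history[s] is assumed to be the
    trading price at the END of step s, i.e. tp(s).
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step == 0:
        # No earlier step exists: the trading price starts at the catalog price.
        return catalog_price
    # While trading in step s the agent knows tp(s-1) but not yet tp(s).
    return trading_price_history[step - 1]


catalog = 14.0
history = [14.5, 17.0, 16.2]  # made-up values for tp(0), tp(1), tp(2)
scales = [penalty_scale_for_step(s, catalog, history) for s in range(3)]
print(scales)
```

In step 0 the scale equals the catalog price, which would explain seeing 14 for a second-level agent early in a simulation.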

> agent.ufun([6, 0, 15]) - agent.ufun.shortfall_penalty * agent.ufun.output_penalty_scale - agent.ufun([7, 0, 15]) = 0. I think this should be equal to the production cost instead of 0, since we're charged $m_a Q^{out}$ not $m_a Q^{out}$ according to equation 5.

In this case the agent has only 6 inputs but agreed to sell 7. It will produce only 6 items, not 7, so there is no extra production cost. $Q^{out}$ already takes that into account and will be 6 in both cases. That implies that in both cases the agent pays for the production of 6 items. Right?
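
A simplified model makes this argument concrete. The sketch below is not the scml utility function: `simple_ufun` is a hypothetical helper that assumes the agent is paid only for units it delivers, that `total_input_cost` is the total (not per-unit) price of the inputs, and that the production cost is 1 per unit. The other numbers come from the report above, and the results are not expected to reproduce the library's exact utilities.

```python
def simple_ufun(
    q_sell: int,
    unit_price: float,
    q_in: int,
    total_input_cost: float,
    production_cost: float,
    shortfall_penalty: float,
    output_penalty_scale: float,
) -> float:
    """Toy one-step utility (hypothetical, not the scml implementation)."""
    # Production is capped by the available inputs.
    produced = min(q_in, q_sell)
    # Units promised but not delivered.
    shortfall = max(0, q_sell - q_in)
    # Assumption: revenue is earned only on delivered units.
    revenue = unit_price * produced
    return (
        revenue
        - total_input_cost
        - production_cost * produced
        - shortfall_penalty * output_penalty_scale * shortfall
    )


params = dict(
    unit_price=15.0,
    q_in=6,
    total_input_cost=66.0,
    production_cost=1.0,  # assumed value, not given in the report
    shortfall_penalty=0.47583515219370137,
    output_penalty_scale=10.0,
)
u6 = simple_ufun(q_sell=6, **params)
u7 = simple_ufun(q_sell=7, **params)
gap = u6 - params["shortfall_penalty"] * params["output_penalty_scale"] - u7
print(u6, u7, gap)
```

Because `produced` is 6 in both calls, the production cost cancels and the two utilities differ only by the penalty on the one undelivered unit, so `gap` is zero in this model.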

dipplestix commented 3 years ago

Great, thanks for the explanation/fix!