ckrk / bidding_learning

Implementations of the Deep Q-Learning Algorithms for Auctions
MIT License

Bidding-Learning

Version 1.0.4 Forked to rangL competition branch

Version 1.0.3 Forked to Single Player branch

Version 1.0.2 - Change log

Version 1.0.1 - Change log

Implementations of the Deep Q-Learning Algorithms for Auctions.

What should it do?

Note that the algorithm involves randomness

The algorithm is seeded and delivers reproducible results when run with the same seed. Nonetheless, it intrinsically relies on randomness for exploration, so runs with different seeds or settings will differ. Occasionally a run produces strange results simply because it was unlucky. If the values look nonsensical, rerun the algorithm 2-3 times in varying settings to check whether it fails repeatedly.
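
The effect of seeding on exploration can be illustrated with a minimal sketch. It uses only Python's standard `random` module and is not taken from the repository; the function `exploration_noise` and its parameters `steps` and `sigma` are hypothetical names chosen for illustration.

```python
import random

def exploration_noise(seed, steps=5, sigma=0.1):
    """Draw `steps` Gaussian noise samples from a generator seeded with `seed`.

    Hypothetical helper for illustration; the repository's own noise
    model may be implemented differently.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, sigma) for _ in range(steps)]

same_a = exploration_noise(seed=42)
same_b = exploration_noise(seed=42)
other = exploration_noise(seed=43)

print(same_a == same_b)  # same seed: the noise sequences match
print(same_a == other)   # different seed: the noise sequences differ
```

Fixing the seed makes a single run repeatable, but it says nothing about whether that particular noise sequence leads to a good result, which is why several runs are worth comparing.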

How to run?

Requirements

Citing

If you use our algorithm in your work, please cite the accompanying paper:

@article{graf2021computational,
      title={{Computational Performance of Deep Reinforcement Learning to Find Nash Equilibria}}, 
      author={Christoph Graf and Viktor Zobernig and Johannes Schmidt and Claude Kl\"ockl},
      year={2024},
      journal={Computational Economics},
      volume={63},
      pages={529--576}
}

How to customize a run of the algorithm?

Environment Parameters

The user can set the following parameters by passing them as arguments to EnvironmentBidMarket, which is defined in environment_bid_market.py. This is usually done in main.py, but the environment can also be instantiated directly.

EnvironmentBidMarket(capacities = capacities, costs = costs, demand = [5,6], agents = 1, fringe = 1, past_action = 1, lr_actor = 1e-4, lr_critic = 1e-3, normalization = 'none', reward_scaling = 1, action_limits = [-1,1], rounds_per_episode = 1)
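
When comparing several runs, it can help to keep the arguments of the call above in a dictionary and override only what changes. The following is a rough sketch under stated assumptions: the values for `capacities` and `costs` are placeholders, the helper `with_overrides` is hypothetical, and the overridden values are arbitrary examples rather than recommended settings.

```python
# Placeholder inputs, assumed for illustration only; the real values
# for capacities and costs are defined by the user (usually in main.py).
capacities = [5, 5]
costs = [20, 30]

# Keyword arguments mirroring the EnvironmentBidMarket call shown above.
params = dict(
    capacities=capacities,
    costs=costs,
    demand=[5, 6],
    agents=1,
    fringe=1,
    past_action=1,
    lr_actor=1e-4,
    lr_critic=1e-3,
    normalization='none',
    reward_scaling=1,
    action_limits=[-1, 1],
    rounds_per_episode=1,
)

def with_overrides(base, **changes):
    """Return a copy of `base` with selected parameters replaced.

    Hypothetical helper: unknown parameter names are rejected so that
    typos do not silently create new keys.
    """
    unknown = set(changes) - set(base)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**base, **changes}

variant = with_overrides(params, agents=2, rounds_per_episode=10)

# With the repository on the import path, the environment could then be
# created as: env = EnvironmentBidMarket(**variant)
print(variant["agents"], variant["rounds_per_episode"])
```

The base dictionary is left unchanged, so each variant of a run stays traceable to the same defaults.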

The output mode is hard-coded in the render method of EnvironmentBidMarket.

Test Parameters

The noise model and its variants are hard-coded in main.py. There is:

Network Architecture

The architectures of the actor and critic networks are hard-coded in actor_critic.py.

Dependency Structure: