IQL setup for Custom Env

I've been working with a custom environment in my fork linked here: https://github.com/thomasychen/rm-MARLlib/blob/master/marllib/envs/base_env/buttons.py

Our script that currently runs IPPO is here: https://github.com/thomasychen/rm-MARLlib/blob/master/good_scripts/automate_checkpoints.py. We want to switch to IQL because we need an off-policy algorithm. The code runs when I simply change every reference to the ippo algo to iql, but the documentation is a little unclear on how this affects learning, and I'm unsure which member of the joint Q family is actually being executed in my implementation, since the docs describe several variants: https://marllib.readthedocs.io/en/latest/algorithm/jointQ_family.html#iql

A brief explanation of how to run IQL with this custom environment would be greatly appreciated! To get it working at all, I did have to change "share_policy" in ray.yaml to "all".
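To make the "share_policy" part of the question concrete, below is a toy tabular sketch of independent Q-learning with and without sharing. It is not MARLlib code and makes no claim about how MARLlib implements IQL (which uses neural networks and a replay buffer). The helper names `make_q_tables` and `iql_update` are hypothetical, and the sketch assumes "all" means every agent updates one shared value function.

```python
from collections import defaultdict


def make_q_tables(agent_ids, share_policy):
    """Map each agent to a Q-table (hypothetical helper, not a MARLlib API).

    Assumption: "all" means every agent reads and writes the same table;
    any other value gives each agent its own table.
    """
    if share_policy == "all":
        shared = defaultdict(float)
        return {agent: shared for agent in agent_ids}
    return {agent: defaultdict(float) for agent in agent_ids}


def iql_update(q, obs, action, reward, next_obs, n_actions, alpha=0.1, gamma=0.9):
    """One independent Q-learning step from a single agent's own transition."""
    best_next = max(q[(next_obs, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(obs, action)] += alpha * (td_target - q[(obs, action)])


agents = ["agent_0", "agent_1"]
shared = make_q_tables(agents, "all")
separate = make_q_tables(agents, "individual")

# Only agent_0 experiences a rewarding transition.
for tables in (shared, separate):
    iql_update(tables["agent_0"], obs=0, action=1, reward=1.0, next_obs=1, n_actions=2)

# With sharing, agent_1's estimate moves too; without it, agent_1 is unaffected.
shared_value = shared["agent_1"][(0, 1)]
separate_value = separate["agent_1"][(0, 1)]
```

In this toy, each agent still learns only from its own observations and rewards, which is what makes it "independent". Sharing changes only whose experience feeds which table.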
