[RLlib]: SimpleQ TF2 is broken

What happened + What you expected to happen

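Something is broken in the SimpleQ TF2 action distribution, but I can't track down the bug. The reproduction script below exercises it by calling policy.compute_log_likelihoods() on a sampled batch with SoftQ exploration. This doesn't happen with TF1 or Torch.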

Versions / Dependencies

ray 3.0.0dev0 (master), Ubuntu 20.04, tensorflow 2.7.0 (pypi)

Reproduction script

from ray.rllib.algorithms.simple_q import SimpleQConfig

config = (
    SimpleQConfig()
    .environment(env="CartPole-v0")
    .rollouts(num_rollout_workers=0)
    .framework("tf2")
    .exploration(exploration_config={"type": "SoftQ", "temperature": 1.0})
)

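# Build the algorithm and collect one sample batch from the local worker.
algo = config.build()
policy = algo.get_policy()
batch = algo.workers.local_worker().sample()
# Compute log-likelihoods of the sampled actions under the TF2 policy.
log_likelihoods = policy.compute_log_likelihoods(batch["actions"], batch["obs"])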
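
For reference, here is a minimal NumPy sketch of what I would expect the log-likelihoods to look like, assuming SoftQ samples actions from a softmax over the Q-values divided by the temperature. The helper name softq_log_likelihoods and the Q-values are made up for illustration; this is not RLlib code, only a way to sanity-check the numbers coming back from the policy.

```python
import numpy as np


def softq_log_likelihoods(q_values, actions, temperature=1.0):
    """Log-probability of each taken action under softmax(q_values / temperature).

    Hypothetical helper for sanity checks; not part of RLlib.
    """
    logits = np.asarray(q_values, dtype=np.float64) / temperature
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    actions = np.asarray(actions, dtype=np.int64)
    # Pick the log-probability of the action actually taken in each row.
    return log_probs[np.arange(len(actions)), actions]


# Made-up Q-values for two observations with two actions each (as in CartPole).
q = np.array([[1.0, 1.0], [2.0, 0.0]])
a = np.array([0, 1])
ll = softq_log_likelihoods(q, a, temperature=1.0)
```

Under that assumption every returned value should be at most 0, and equal Q-values for two actions should give log(0.5) for either action. Values outside that range from the TF2 policy would point at the action distribution.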

Issue Severity

Medium: It is a significant difficulty but I can work around it.

ray-project / ray

[RLlib]: SimpleQ TF2 is broken #26192