[BUG] citylearn.reward_function.MARL

Hi, I think there is a bug in citylearn.reward_function.MARL.

Issue Description

When I use the MARL reward function from citylearn.reward_function as my reward function, an error is raised at line 64 of that file.

Expected Behavior

It should return the larger of 0 and district_electricity_consumption.

Actual Behavior

TypeError: 'numpy.float64' object cannot be interpreted as an integer

Steps to Reproduce

Replace the reward function with MARL and run the environment; the error then appears.
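The error can probably be reproduced without CityLearn at all. The snippet below is a minimal sketch that assumes line 64 calls np.nanmax with two positional arguments; the value 12.5 is a made-up stand-in for the district consumption. Because the second positional parameter of np.nanmax is axis, a float passed there cannot be used as an axis index.

```python
import numpy as np

# Hypothetical value standing in for the district consumption at one time step.
district_electricity_consumption = np.float64(12.5)

raised = False
try:
    # Assumed form of the failing call: the second positional argument is
    # treated as `axis`, not as a second value to compare against.
    np.nanmax(0, district_electricity_consumption)
except TypeError as error:
    raised = True
    print(f"TypeError: {error}")
```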

Environment

CityLearn version: 1.8.0
Operating System: Ubuntu 22.04
Python version: 3.8.0

Possible Solution

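Wrap the two values passed to np.nanmax in a tuple, so that they are treated as a single array-like argument. Without the extra parentheses, the second value is interpreted as the axis argument, which has to be an integer.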

reward = np.sign(building_electricity_consumption)*0.01*building_electricity_consumption**2*np.nanmax((0, district_electricity_consumption))
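As a quick check of the proposed line, here is a small standalone sketch. The function name marl_reward and the input values are hypothetical and are not part of CityLearn; only the expression itself is taken from the line above.

```python
import numpy as np

def marl_reward(building_electricity_consumption, district_electricity_consumption):
    """Hypothetical wrapper around the proposed expression, for testing only."""
    # The tuple makes np.nanmax compare 0 and the district value,
    # instead of reading the district value as `axis`.
    district_term = np.nanmax((0, district_electricity_consumption))
    return (
        np.sign(building_electricity_consumption)
        * 0.01
        * building_electricity_consumption ** 2
        * district_term
    )

# Made-up inputs: one positive and one negative district consumption.
reward_positive_district = marl_reward(np.float64(-2.0), np.float64(3.0))
reward_negative_district = marl_reward(np.float64(-2.0), np.float64(-5.0))
```

With a negative district consumption, the np.nanmax term clips to 0, so the reward is zero regardless of the building's own consumption.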
