flow-project / flow

Computational framework for reinforcement learning in traffic control

RL traffic lights in sumo imported network #537

Open DMarcelAM opened 5 years ago

DMarcelAM commented 5 years ago

Hello, I have imported a SUMO network file (net.xml) and I want to train RL traffic lights in this scenario. To do this I have created a custom scenario (with the routes) and a custom environment (based on green_wave_env) to train the RL traffic lights, but I am getting some errors.

Here is the code I used (based on the green_wave example):

# imports as in the green_wave RLlib example (assumed; they were omitted from the original snippet)
import json

import ray
from ray.rllib.agents.registry import get_agent_class
from ray.tune import run_experiments
from ray.tune.registry import register_env

from flow.core.params import (SumoParams, EnvParams, InitialConfig,
                              NetParams, InFlows, VehicleParams)
from flow.utils.registry import make_create_env
from flow.utils.rllib import FlowParamsEncoder

HORIZON = 200
N_ROLLOUTS = 20
N_CPUS = 2

inflow = InFlows()

inflow.add( veh_type="human",
            edge="S_N",
            probability=0.03125,
            departLane="0",
            departSpeed=1)

inflow.add( veh_type="human",
            edge="N_S",
            probability=0.03125,
            departLane="0",
            departSpeed=1)

inflow.add( veh_type="human",
            edge="W_E",
            probability=0.09,
            departLane="0",
            departSpeed=1)

inflow.add( veh_type="human",
            edge="E_W",
            probability=0.03125,
            departLane="0",
            departSpeed=1)

env_params = EnvParams()
sim_params = SumoParams(render=True, sim_step=1)
initial_config = InitialConfig()
vehicles = VehicleParams()
vehicles.add('human', num_vehicles=0)

net_params = NetParams(inflows=inflow,
    template='/home/lsi04/flow/tutorials/networks/grid.net.xml',
    no_internal_links=False)

additional_env_params = {
    'target_velocity': 50,
    'switch_time': 3.0,
    'num_observed': 2,
    'discrete': False,
    'tl_type': 'controlled',
}

flow_params = dict(
    exp_tag='green_wave',
    env_name='JavierPradoGridEnv',
    scenario='JavierPradoScenario',
    simulator='traci',
    sim=SumoParams(
        sim_step=1,
        render=False),
    env=EnvParams(
        horizon=HORIZON,
        additional_params=additional_env_params),
    net=net_params,
    veh=vehicles,
    initial=initial_config,
)

def setup_exps():

    alg_run = 'PPO'

    agent_cls = get_agent_class(alg_run)
    config = agent_cls._default_config.copy()
    config['num_workers'] = N_CPUS
    config['train_batch_size'] = HORIZON * N_ROLLOUTS
    config['gamma'] = 0.999  # discount rate
    config['model'].update({'fcnet_hiddens': [32, 32]})
    config['use_gae'] = True
    config['lambda'] = 0.97
    config['kl_target'] = 0.02
    config['num_sgd_iter'] = 10
    config['clip_actions'] = False  # FIXME(ev) temporary ray bug
    config['horizon'] = HORIZON

    flow_json = json.dumps(
        flow_params, cls=FlowParamsEncoder, sort_keys=True, indent=4)
    config['env_config']['flow_params'] = flow_json
    config['env_config']['run'] = alg_run

    create_env, gym_name = make_create_env(params=flow_params, version=0)

    register_env(gym_name, create_env)

    return alg_run, gym_name, config

alg_run, gym_name, config = setup_exps()

ray.init(num_cpus=N_CPUS + 1, redirect_output=False)
trials = run_experiments(
    experiments={
        flow_params['exp_tag']: {
            'run': alg_run,
            'env': gym_name,
            'config': {**config},
            'checkpoint_freq': 20,
            'max_failures': 999,
            'stop': {'training_iteration': 20},
        }
    })

Here are the errors I get:

Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-06-05_11-46-50_23722/logs.
Waiting for redis server at 127.0.0.1:52973 to respond...
Waiting for redis server at 127.0.0.1:65510 to respond...
Starting the Plasma object store with 13.471593267000001 GB memory using /dev/shm.

======================================================================
View the web UI at http://localhost:8889/notebooks/ray_ui.ipynb?token=4c9f25316d8d55134afb8e1d6b3d7f3c710f830bf1f93175
======================================================================

== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/3 CPUs, 0/1 GPUs
Memory usage on this node: 7.1/33.7 GB

Created LogSyncer for /home/lsi04/ray_results/green_wave/PPO_JavierPradoGridEnv-v0_0_2019-06-05_11-46-51ktobyge9 -> 
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 3/3 CPUs, 0/1 GPUs
Memory usage on this node: 7.1/33.7 GB
Result logdir: /home/lsi04/ray_results/green_wave
RUNNING trials:
 - PPO_JavierPradoGridEnv-v0_0: RUNNING

Error processing event.
Traceback (most recent call last):
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 261, in _process_events
    result = self.trial_executor.fetch_result(trial)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 211, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/worker.py", line 2386, in get
    raise value
ray.worker.RayTaskError: ray_PPOAgent:train() (pid=23764, host=lsi04)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/agent.py", line 279, in train
    result = Trainable.train(self)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trainable.py", line 146, in train
    result = self._train()
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/ppo/ppo.py", line 101, in _train
    fetches = self.optimizer.step()
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 125, in step
    self.num_envs_per_worker, self.train_batch_size)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/rollout.py", line 28, in collect_samples
    next_sample = ray.get(fut_sample)
ray.worker.RayTaskError: ray_PolicyEvaluator:sample() (pid=23779, host=lsi04)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 52, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
AttributeError: 'list' object has no attribute 'reshape'

During handling of the above exception, another exception occurred:

ray_PolicyEvaluator:sample() (pid=23779, host=lsi04)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 368, in sample
    batches = [self.input_reader.next()]
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/offline/input_reader.py", line 25, in next
    batches = [self.sampler.get_data()]
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 64, in get_data
    item = next(self.rollout_provider)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 267, in _env_runner
    preprocessors, obs_filters, unroll_length, pack, callbacks)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 346, in _process_observations
    policy_id).transform(raw_obs)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/models/preprocessors.py", line 161, in transform
    for (o, p) in zip(observation, self.preprocessors)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/models/preprocessors.py", line 161, in <listcomp>
    for (o, p) in zip(observation, self.preprocessors)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 257, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 62, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/home/lsi04/anaconda3/envs/flow/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 42, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: cannot reshape array of size 1 into shape (0,)

Worker ip unknown, skipping log sync for /home/lsi04/ray_results/green_wave/PPO_JavierPradoGridEnv-v0_0_2019-06-05_11-46-51ktobyge9
Attempting to recover trial state from last checkpoint.

I have been trying with different SUMO network files, but I always get the first error and the "shape (0,)" in the second error.

Thanks for your time and regards.

AboudyKreidieh commented 5 years ago

Hi @DMarcelAM, I feel like the issue might have something to do with your definition of the observation_space or action_space variables in your environment. To help me reproduce your bug, would you be able to share these methods as well? Thank you!

DMarcelAM commented 5 years ago

Hi @AboudyKreidieh, the network I am importing is an intersection too, so I didn't change these methods from green_wave_env. Thanks for the answer.

    @property
    def action_space(self):
        """See class definition."""
        if self.discrete:
            return Discrete(2 ** self.num_traffic_lights)
        else:
            return Box(
                low=-1,
                high=1,
                shape=(self.num_traffic_lights,),
                dtype=np.float32)

    @property
    def observation_space(self):
        """See class definition."""
        speed = Box(
            low=0,
            high=1,
            shape=(self.scenario.vehicles.num_vehicles,),
            dtype=np.float32)
        dist_to_intersec = Box(
            low=0.,
            high=np.inf,
            shape=(self.scenario.vehicles.num_vehicles,),
            dtype=np.float32)
        edge_num = Box(
            low=0.,
            high=1,
            shape=(self.scenario.vehicles.num_vehicles,),
            dtype=np.float32)
        traffic_lights = Box(
            low=0.,
            high=1,
            shape=(3 * self.rows * self.cols,),
            dtype=np.float32)
        return Tuple((speed, dist_to_intersec, edge_num, traffic_lights))

AboudyKreidieh commented 5 years ago

Ah, I see. I believe that since you are using a template, and thereby not initializing the network with a set number of vehicles, self.scenario.vehicles.num_vehicles is initialized to 0, which produces the (0,) shape bug. You could replace these variables with a fixed (non-zero) number, and that should do the trick.

Let me know if it doesn't work and I'll be happy to iterate through this with you.
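
A minimal sketch of that suggestion, reusing the gym spaces from the snippet above; MAX_OBSERVED_VEHICLES is a hypothetical constant (not from the original code), and get_state would then also have to return vectors of this fixed size:

MAX_OBSERVED_VEHICLES = 20  # hypothetical fixed upper bound, replaces num_vehicles

    @property
    def observation_space(self):
        """Sketch: size each space off a fixed bound instead of
        self.scenario.vehicles.num_vehicles, which is 0 for template networks."""
        speed = Box(
            low=0, high=1, shape=(MAX_OBSERVED_VEHICLES,), dtype=np.float32)
        dist_to_intersec = Box(
            low=0., high=np.inf, shape=(MAX_OBSERVED_VEHICLES,), dtype=np.float32)
        edge_num = Box(
            low=0., high=1, shape=(MAX_OBSERVED_VEHICLES,), dtype=np.float32)
        traffic_lights = Box(
            low=0., high=1, shape=(3 * self.rows * self.cols,), dtype=np.float32)
        return Tuple((speed, dist_to_intersec, edge_num, traffic_lights))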

DMarcelAM commented 5 years ago

I tried different numbers but I keep getting similar errors. When I set self.scenario.vehicles.num_vehicles = 1 I get ValueError: cannot reshape array of size 2 into shape (1,). With self.scenario.vehicles.num_vehicles = 2 I get ValueError: cannot reshape array of size 1 into shape (2,), and with 3 I get ValueError: cannot reshape array of size 1 into shape (3,). In general, when I set self.scenario.vehicles.num_vehicles = X I get ValueError: cannot reshape array of size 1 into shape (X,), except for X = 1.

DMarcelAM commented 5 years ago

I set num_vehicles=0 in the green_wave example (tot_cars=0) and there is no error, but when I add inflows to the green_wave example I get the same error: ValueError: cannot reshape array of size 1 into shape (0,). By the way, when I don't use inflows and set num_vehicles=0 in my code, it runs fine. What can I do if I want to use inflows?

AboudyKreidieh commented 5 years ago

I think the environment you are using may include a get_state method that doesn't take into account variability in the number of vehicles, hence the shape error when you turn on inflows. You can probably debug this by printing self.state.shape or len(self.state) and seeing whether the number changes. If so, I would recommend padding the observations (see flow/envs/merge.py as an example).
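
For illustration, a minimal sketch of the padding idea (pad_observation and num_obs_slots are made-up names; flow/envs/merge.py does something along these lines for its observations):

import numpy as np

def pad_observation(values, num_obs_slots):
    """Zero-pad (or truncate) a variable-length list of per-vehicle values
    so the returned observation always has the same fixed length."""
    obs = np.zeros(num_obs_slots, dtype=np.float32)
    values = np.asarray(values, dtype=np.float32)[:num_obs_slots]
    obs[:len(values)] = values
    return obs

# e.g. speeds for however many vehicles the inflows have inserted so far
print(pad_observation([5.2, 7.1, 3.3], num_obs_slots=10))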

DMarcelAM commented 5 years ago

Hi @AboudyKreidieh, and thanks for answering my questions. When you say that my get_state method should not depend on the variable number of vehicles, how can I use the speed of each vehicle as part of the state when the number of vehicles is still variable?

eugenevinitsky commented 5 years ago

Hi @DMarcelAM, the tricky thing is that, unless you use something special, a neural network takes in a fixed number of states as its input. If the number of states is variable, you either have to: (1) figure out how to select a fixed number of vehicle states from the total number of vehicle states, (2) use a neural net architecture that can adapt to the number of vehicles in the system, like a transformer, or (3) convert the problem into some other format where the number of vehicle states is fixed.

So, for example, you could imagine only including the states of the 10 vehicles closest to the intersection, or something like that.
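
As a sketch of that idea (closest_vehicle_speeds and num_closest are made-up names, and get_distance_to_intersection plus the env.k.vehicle calls are assumed to match the grid environments' API):

import numpy as np

def closest_vehicle_speeds(env, num_closest=10):
    """Sketch: observe only the num_closest vehicles nearest the intersection,
    zero-padding the vector when fewer vehicles are present."""
    ids = env.k.vehicle.get_ids()
    # sort vehicles by their distance to the intersection and keep the nearest ones
    dists = [env.get_distance_to_intersection(veh_id) for veh_id in ids]
    nearest = np.argsort(dists)[:num_closest]
    speeds = np.zeros(num_closest, dtype=np.float32)
    for slot, idx in enumerate(nearest):
        speeds[slot] = env.k.vehicle.get_speed(ids[idx])
    return speeds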

pengyuan-zhou commented 4 years ago

Hi, I got a similar error when testing flow_mappdg on singleagent_traffic_light_grid. The error is:

~/git/flow/flow/envs/base.py", line 334 in step
 self.observed_ids.update(self.k.vehicle.get_ids())
AttributeError: 'list' object has no attribute 'update'

I let it print out observed_ids and its type, and it really is a list instead of the expected dict: ['idm_3', 'idm_16', 'idm_10', 'idm_4', 'idm_12', 'idm_2', 'idm_0', 'idm_15', 'idm_11', 'idm_19', 'idm_9', 'idm_6', 'idm_7', 'idm_5', 'idm_8', 'idm_13', 'idm_18', 'idm_17', 'idm_14', 'idm_1']

EDIT: a KeyError: 'idm_X' came up after manually converting observed_ids and observed_rl_ids to a set. I will keep looking into this; I'd appreciate it if you could test the traffic light grid scripts :) @eugenevinitsky
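
A minimal, standalone illustration of that type mismatch (plain Python, not flow's actual code):

observed_ids = []                 # if the attribute starts life as a list...
new_ids = ['idm_0', 'idm_1']
# observed_ids.update(new_ids)    # AttributeError: 'list' object has no attribute 'update'

observed_ids = set(observed_ids)  # a set does support .update(), which is what step() calls
observed_ids.update(new_ids)
print(observed_ids)               # e.g. {'idm_0', 'idm_1'} (set order is not guaranteed)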