smearle / gym-city

An interface with micropolis for city-building agents, packaged as an OpenAI gym environment
MIT License
141 stars 18 forks

Micropolis code uses a python2 base while gym_micropolis uses python3, causing some conflicts #4

Closed manojankk closed 3 years ago

manojankk commented 5 years ago

I have tried the above code on Linux CentOS 7 (64-bit). The gi package (which I changed to pgi) is not compatible with python3, and I am seeing some kernel errors.

manojankk commented 5 years ago

(test.py:27889): CRITICAL **: 05:43:09.256: g_function_info_get_flags: assertion 'GI_IS_FUNCTION_INFO (info)' failed

(test.py:27889): CRITICAL **: 05:43:09.256: g_callable_info_get_n_args: assertion 'GI_IS_CALLABLE_INFO (info)' failed

(test.py:27889): CRITICAL **: 05:43:09.256: g_callable_info_get_return_type: assertion 'GI_IS_CALLABLE_INFO (info)' failed

(test.py:27889): CRITICAL **: 05:43:09.256: g_type_info_get_tag: assertion 'info != NULL' failed

smearle commented 5 years ago

Thanks for your interest! These look like they're caused while trying to render the game window using gtk. Have you tried running without the --render option? The more details the better.

Could you also explain what you've written in the title a little bit? I used to be only able to render the GUI in python2, but I've since updated so that everything should run in python3. However, some of the gtk stuff in micropolis/MicropolisCore still needs to be ported from gtk2 to gtk3 (e.g., the gui renders with some missing pie menu icons).

manojankk commented 5 years ago

I tried the below steps:

  1. I could not make install due to a Swig3.0 compile issue, so I tried running the python code directly.
  2. Used a venv with python3 and installed cairo, gobject, pygtk, etc. I was unable to import the gi library in python3, so I installed pgi and changed the imports in all the python files under gym-micropolis-master/micropolis/MicropolisCore/src/pyMicropolis/micropolisEngine/*.py, i.e., changed

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

to

import pgi
pgi.require_version('Gtk', '3.0')
from pgi.repository import Gtk

  3. Ran python3 main.py --log-dir trained_models/acktr --algo acktr --model squeeze --num-process 24 --map-width 27 --render. It is running and creating some 23 (I did not count exactly) blank windows, which stay as they are. It is creating some CSV files too. The server I used is CentOS 7 (64-bit) with GPU support. (I will update with more details soon.)

Kindly point out if I am fundamentally wrong somewhere.
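As an aside, the gi-vs-pgi swap described above can be automated with a fallback import rather than edited into every file by hand. This is a hypothetical sketch (load_gtk_backend is not a function from this repo); it relies on pgi mirroring gi's require_version API:

```python
import importlib

def load_gtk_backend(candidates=("gi", "pgi")):
    """Return the first importable GObject-introspection backend, or None.

    Tries PyGObject's gi first, then falls back to pgi, which provides
    (partial) API compatibility for pure-python environments.
    """
    for name in candidates:
        try:
            backend = importlib.import_module(name)
        except ImportError:
            continue
        backend.require_version("Gtk", "3.0")
        return backend
    return None

# Usage (assuming one backend is installed):
# backend = load_gtk_backend()
# Gtk = importlib.import_module(backend.__name__ + ".repository").Gtk
```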

smearle commented 5 years ago

It sounds like you might actually be successfully running a training session, but that the pgi hack is resulting in all agents (rather than just the first) being rendered. Try without the --render option, and look for an entropy/reward/loss etc. printout in the console.

What went wrong with the gi import? Might be the only thing really standing in your way. I had a messy time installing that library as well.

manojankk commented 5 years ago

Fixed the gi import issue, cloned the latest code, and ran main.py.

  1. Invalid argument - num_actions - fixed
  2. CNN input/output tensor sizes not matching:

Traceback (most recent call last):
  File "main.py", line 374, in <module>
    main()
  File "main.py", line 150, in main
    icm_enabled=args.curiosity)
  File "/.../gym-micropolis/model.py", line 81, in act
    value, actor_features, rnn_hxs = self.base(inputs, rnn_hxs, masks)
  File "/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/.../gym-micropolis/model.py", line 681, in forward
    x = torch.cat((x, x_i), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 13 and 12 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:83
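For context on this class of error: torch.cat along the channel axis requires matching spatial dims, and an odd map width can break that (e.g. 27 halves to 13 under a stride-2 layer, then doubles back to 26, so 13 meets 12). Here is a minimal numpy sketch of the mismatch and a crop-before-concat workaround; cat_cropped is purely illustrative, not the repo's actual fix:

```python
import numpy as np

# Two branches whose feature maps reach the concatenation with
# mismatched spatial sizes (13 vs 12), as in the error above.
x = np.zeros((1, 64, 13, 13))    # features from one branch
x_i = np.zeros((1, 15, 12, 12))  # features from a downsampled/upsampled branch

def cat_cropped(a, b, axis=1):
    """Crop both maps to their shared spatial size, then concatenate."""
    h = min(a.shape[2], b.shape[2])
    w = min(a.shape[3], b.shape[3])
    return np.concatenate((a[:, :, :h, :w], b[:, :, :h, :w]), axis=axis)

out = cat_cropped(x, x_i)
print(out.shape)  # (1, 79, 12, 12): 64 + 15 channels, cropped to 12x12
```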

smearle commented 5 years ago

That's good news. I've just fixed a line of code in the squeeze model which made an assumption about the map-width, so update your copy of the repo and try again.

You might also want to try running with '--model fractal' and '--n-recs 3'. Or, with '--model fixed'. The "model" option determines the structure of the neural network we are attempting to train. I believe the fractal neural architecture is the quickest and most effective learner of the task thus far.

manojankk commented 5 years ago

Thanks. I fixed one minor bug and the fractal model is now running... no idea how long it will take. The CSV is generated with 2 header lines (1 comment line and 1 header row), so I added one more f.readline() in visualize.py (in the load_data method).
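The extra f.readline() fix can be shown in isolation. A hypothetical sketch (the comment line and column names below are made up for illustration, not the repo's actual monitor format):

```python
import csv
import io

# A CSV whose first two lines are a comment and a header row,
# both of which must be consumed before parsing numeric rows.
raw = io.StringIO(
    "# {'env_id': 'MicropolisEnv'}\n"
    "r,l,t\n"
    "685.9,200,12.3\n"
    "1072.4,200,24.6\n"
)

raw.readline()  # skip the comment line
raw.readline()  # skip the header row (the extra readline added to load_data)
rewards = [float(row[0]) for row in csv.reader(raw)]
print(rewards)  # [685.9, 1072.4]
```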

Last console output (still running...):

Updates 17530, num timesteps 2103720, FPS 323
Last 10 training episodes: mean/median reward 1072.4/685.9, min/max reward 2.8/2863.4
dist entropy 0.4, val/act loss 1.4/-0.1

smearle commented 5 years ago

Nice. Here is the paper on which I based the fractal implementation, if you're curious: https://arxiv.org/abs/1605.07648. You can also train with --drop-path, and --eval-interval 100, for example (to evaluate, log and visualize the deterministic performance of the entire network and of the individual fractal 'columns' - see below), and run inference on particular subnetworks (columns 0 through [n-recs - 1]) using python3 enjoy.py --load-dir some/directory --active-col 2 --map-width 16.

Training runs for 10M frames by default (control this with the --num-frames option). But if you're curious, you can run inference using enjoy.py --render and have it load the weights you're currently training (and you can run this on a variety of map-sizes, and with user interaction, to see how well the model is generalizing). Given the reward you're already seeing in the printout, your model has probably learnt how to make basic swaths of low-density residential, or something like that, but have a look.

What follows is some unsolicited background info, in case you want to wade into model.py yourself.

As for the motivation behind the fractal network: I was finding that using a single repeated ('recursive') convolution (as in the 'fixed' model) yielded good results (e.g., the gif in the readme) - so long as it had a skip connection. Each column of the fractal model is exactly such a repeated convolution (with each column comprising a different number of recursions, by factors of 2), though they intertwine with one another at regular intervals in the overall model; these built-in skip connections, as it were, seem to improve speed and stability of learning, upon somewhat cursory comparison.
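The repeated-convolution-with-skip-connection idea above can be sketched outside of pytorch. In this minimal numpy illustration (assumptions of mine, not the repo's model.py), the same 3x3 kernel is applied n_recs times with a skip connection from the input, so the receptive field grows by one tile per recursion and distant tiles come to influence one another:

```python
import numpy as np

def conv3x3(x, w):
    """'Same'-padded 3x3 convolution of a single-channel 2D map."""
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * p[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def recursive_conv(x, w, n_recs):
    skip = x                       # skip connection from the input
    for _ in range(n_recs):        # weight sharing: same w each recursion
        x = np.maximum(conv3x3(x, w) + skip, 0)  # ReLU
    return x

rng = np.random.default_rng(0)
x = np.zeros((8, 8))
x[0, 0] = 1.0                      # a single active tile
w = np.abs(rng.normal(size=(3, 3)))  # positive weights, so influence spreads
y = recursive_conv(x, w, n_recs=4)
# After 4 recursions of a 3x3 kernel, the tile at (0, 0) influences
# everything within Chebyshev distance 4, and nothing beyond it:
print(y[4, 4] > 0, y[5, 5] == 0)
```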

Meanwhile, I've been trying to conceive of a way to extend this notion of recursive convolution to a model that produces a compressed representation of the map (the 'squeeze' model), so that we could scale to really big map-sizes more easily. It also seems desirable for the agent to be able to plan at the 'neighbourhood' or 'quadrant' scale, though, interestingly, even models that never shrink the map exhibit certain seemingly high-level behaviours (i.e., neighbourhoods of particular zone-types), especially when repeated convolution allows distant tiles to affect one another during a single forward pass. If we take the fractal model and, instead of adding a single fixed-size convolution with each recurrence as in the paper, add a sequence of repeated convolutions which squeeze and unsqueeze the map to/from some fraction of its original size (using strided convolutions, for example), then each column of the fractal network is such a squeeze network (or multiple squeeze networks stacked sequentially, if we're dealing with one of the internal columns).

Though such 'fractal squeeze' models involve less direct skip connections, their overall performance seems to be on par with that of regular fractal models, and individual squeeze columns seem to fare better than their non-squeeze counterparts upon deterministic evaluation. This makes sense: the non-squeeze column [n-recs - 1] ('leftmost' in the figure in the paper), for example, only allows adjacent tiles to affect one another (since it comprises only a single convolution), whereas a squeeze-unsqueeze sequence can have potentially map-wide range, without either too many additional parameters or recursion over a too-large activation space.
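The squeeze-unsqueeze sequence can likewise be sketched minimally. This numpy illustration uses 2x2 average pooling and nearest-neighbour upsampling as simple stand-ins for the strided convolutions described above (my assumption, not the repo's squeeze model): two squeeze steps shrink a 16x16 map to 4x4, where a single local convolution would connect tiles that are map-widths apart on the original grid.

```python
import numpy as np

def squeeze(x):
    """2x2 average pooling, halving each spatial dim."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def unsqueeze(x):
    """Nearest-neighbour upsampling back to double size."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.zeros((16, 16))
x[0, 0] = 1.0                # a single active tile
z = squeeze(squeeze(x))      # 16x16 -> 4x4; far-apart tiles are now neighbours
y = unsqueeze(unsqueeze(z))  # back to 16x16
print(z.shape, y.shape)      # (4, 4) (16, 16)
```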

Anyway, enjoy the repo and be in touch with questions or observations. I'm working on a similar gym wrapper for OpenTTD and hope to have that up and running soon as well.

manojankk commented 5 years ago

Thanks for the detailed explanation. I have trained the fractal model and tried enjoy.py with --render, but it builds only 1 block and stops.

BASE NETWORK: MicropolisBase_fixed(
  (skip_compress): Conv2d(32, 15, kernel_size=(1, 1), stride=(1, 1))
  (conv_0): Conv2d(32, 64, kernel_size=(1, 1), stride=(1, 1))
  (conv_1): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv_2_0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (critic_compress): Conv2d(79, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (critic_downsize_0): Conv2d(64, 64, kernel_size=(2, 2), stride=(2, 2))
  (actor_compress): Conv2d(79, 19, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (critic_conv): Conv2d(64, 1, kernel_size=(1, 1), stride=(1, 1))
)
/home/nkolli/simcity/projects/manoj36/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
/home/nkolli/simcity/projects/manoj36/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
/home/nkolli/simcity/projects/manoj36/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png

It prints the tiles.png path 3 times and stops there. Any clue? Also, funds are not decreasing as it builds blocks; something is wrong.

smearle commented 5 years ago

Judging by the model printout, it looks like the enjoy.py script is loading up the wrong model to run inference on. Run enjoy with the same model-related arguments that were passed during training. So --model fractal and --n-recs 3 in your case, I believe.

^^Actually it should load up the right model automatically during inference, despite the printout. I'm finding that acktr is unable to properly train the fractal model, and crashes to a rewardless state. If your training logs/graphs show such a crash, then that's why your trained agent is failing. Next time train with --algo a2c.

manojankk commented 5 years ago

As mentioned, I ran a2c with the fractal model for 5M, 10M, and 20M steps, but none of them builds the entire city. The agent places only one block and stays there. I tried debugging enjoy.py, but no luck so far.

python enjoy.py --algo a2c --model fractal --n-recs 3 --num-process 24 --map-width 27 --load-dir ./gym-micropolis/trained_models/a2c/fractal/w27/5M/20190209132608 --render

Updates 41660, num timesteps 4999320, FPS 674
Last 10 training episodes: mean/median reward 11717.0/10134.7, min/max reward 7990.4/18772.5
dist entropy 7.0, val/act loss 0.7/0.3

Looks like it gets best rewards.

Last console output:

BASE NETWORK: MicropolisBase_fractal(
  (conv_00): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
  (fixed_0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fixed_1): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fixed_2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (join_compress): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (compress): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (critic_squeeze): Conv2d(32, 32, kernel_size=(2, 2), stride=(2, 2))
  (actor_out): Conv2d(32, 19, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (critic_out): Conv2d(32, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)

Any thoughts? I am sure I have used the same model and algo, and the path is correct to pick up the env.pt file.

smearle commented 5 years ago

That's very odd. Are you running on the same --map-width as in training? But then again, trained models can normally generalize pretty well to larger map sizes. Try making some builds and seeing if the agent responds to them (right-click to bring up pie menu (sorry the icons are missing)). And yes, be absolutely sure you're pointing to the right --load-dir, preferably by dragging the folder onto the console. Inference is working fine on my end, even with different map-sizes, and even passing superfluous arguments (--num-process --algo --model --n-recs should not change anything during inference; an ambiguity that should be fixed).

In the interim, now that the gi import situation is fixed, you can render while training, and should be able to see the agent doing comprehensible things as it begins to achieve reward. Besides, I'm pretty certain rendering during training brings little overhead. You might also want to increase --num-agents to potentially increase your FPS.

manojankk commented 5 years ago

Thanks. Using the pie menu I tried building some blocks, but succeeded only with roads, rail, parks, etc.; residential, commercial, etc. are not building. Also, while clicking Map -> Residential Zones/Commercial Zones/etc. (if I change the zone), it throws a "Segmentation fault (core dumped)" error. Some C/python code fails; not sure whether this is a gtk library issue.

I added the below snippet wherever applicable to avoid the gdk and gobject bug:

import gi
gi.require_version('Gtk', '3.0')
gi.require_version('PangoCairo', '1.0')
from gi.repository import GObject