Porting to gtk3 / Playing with a neural network

smearle commented 6 years ago

If anyone could lend a hand porting this to python3/gtk3, I've made considerable progress here: https://github.com/smearle/gym-micropolis/tree/master/micropolis-4bots-gtk3, at the service of an OpenAI gym environment enabling the training of city-building bots. There are just some glitches in the gui that need to be worked out.

These bots store their own internal representation of the game map (via the engine.getTile(x, y) function). The gtk3 port above, if initialized with a bot (via pyMicropolis.gtkFrontend.main.train(bot)), will notify the bot of any human-player actions executed via the gui, so that it can update its own map representation and respond accordingly. See below for a simple example of interactive inference from early on in training: (image: coal-a-boo)
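A minimal sketch of that map-mirroring idea, in case it helps anyone picking up the port. Only engine.getTile(x, y) comes from the description above; the FakeEngine stand-in, its setTile helper, the on_player_action callback name, and the tile value 66 are all hypothetical and are not the actual pyMicropolis API.

```python
class FakeEngine:
    """Hypothetical stand-in for the game engine; only getTile(x, y) is from the thread."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._tiles = [[0] * height for _ in range(width)]

    def getTile(self, x, y):
        return self._tiles[x][y]

    def setTile(self, x, y, tile):
        # Hypothetical helper, used here only to simulate a player's build.
        self._tiles[x][y] = tile


class MapMirrorBot:
    """Keeps its own copy of the map and refreshes it when told about an edit."""

    def __init__(self, engine):
        self.engine = engine
        self.map = [[engine.getTile(x, y) for y in range(engine.height)]
                    for x in range(engine.width)]

    def on_player_action(self, x, y, radius=1):
        """Hypothetical callback: re-read the tiles around a GUI edit."""
        changed = []
        for i in range(max(0, x - radius), min(self.engine.width, x + radius + 1)):
            for j in range(max(0, y - radius), min(self.engine.height, y + radius + 1)):
                tile = self.engine.getTile(i, j)
                if tile != self.map[i][j]:
                    self.map[i][j] = tile
                    changed.append((i, j))
        return changed


engine = FakeEngine(8, 8)
bot = MapMirrorBot(engine)
engine.setTile(3, 4, 66)              # pretend the player built something via the gui
changed = bot.on_player_action(3, 4)  # the frontend notifies the bot
print(changed)
```

The bot only re-reads a small window around the edit, which keeps the update cheap; a real frontend would pass a radius large enough to cover the footprint of the tool that was used.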

The networks I'm using are trained using actor-critic (A2C), and are made up entirely of convolutional layers (making them too cumbersome for ACKTR, since Kronecker factorization seems to need to jump through extra hoops to deal with convolutions (?)). Passing activations repeatedly through a single convolutional layer seems to result in performance that is at least on par with - and sometimes apparently better than - that of the same network, sans repeated passes. And finally, each layer of activation is the same width and height as the original input feature-map (where each 'pixel' of the image is a tile of the game map).

By adopting a convolution-only architecture, we pave the way for scalability to larger map sizes (since the action space is the map itself, with build-actions for channels, a linear layer between it and the input would grow quadratically with the number of map tiles). By using recursive (repeated) convolution, we allow distant activations on the feature-map to affect one another, again at minimal computational cost. And by making our hidden activations occur "on the map," we can think of our network as executing a kind of non-discrete Conway's game-of-life to determine its next build, or of letting activations flow spatially through the map itself, which has a certain appeal (though I can't help but feel that the notion of a compressed map, for more abstract, high-level spatial planning, might also be invaluable in this problem space...).
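To put rough numbers on the scaling argument, here is a small weight-counting sketch. The channel counts (32 input features, 19 build-actions) and the 3x3 kernel are arbitrary assumptions for illustration, not the sizes of the actual networks, and biases are ignored.

```python
def dense_weights(width, height, in_channels, action_channels):
    """Weights in a fully connected layer from the whole input map to the whole action map."""
    tiles = width * height
    return (tiles * in_channels) * (tiles * action_channels)


def conv_weights(kernel, in_channels, action_channels):
    """Weights in one kernel x kernel convolution; independent of map size."""
    return kernel * kernel * in_channels * action_channels


def receptive_radius(kernel, passes):
    """How many tiles away an activation can reach after repeated stride-1 passes."""
    return passes * (kernel // 2)


for side in (16, 32, 64):
    print(side, dense_weights(side, side, 32, 19), conv_weights(3, 32, 19))

print(receptive_radius(3, 10))
```

Doubling the map's side length multiplies the dense layer's weight count by 16 while the convolution's stays fixed, and each extra pass through the same 3x3 layer extends an activation's reach by one tile without adding any weights.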

Another such network, a bit further along, without my bullying it: (image: solid_railnres)

If anyone has a spare gpu, I'm currently using a pytorch port of the OpenAI baselines repo for training: https://github.com/smearle/pytorch-baselines-micropolis. I'm very interested in exploring the possible space of neural architectures that lend themselves well to this environment, since it seems to foreshadow a more general game-playing paradigm which takes for input the gui-as-image, and outputs something like an equal-size gray-scale image, with pixel intensity corresponding to mouseclick-likelihood, for example.

In the current context, however, I want to make these agents as interactive as possible - the player should be able to make whatever builds they desire, and have the bot complement their design to optimize for some combination of factors (population, traffic, happiness, pollution, etc.). The player should be able to control these bots from the gui (stopping/starting them, setting their optimization-goals), and confine their building area to subsections of the map. To this end, any help completing the port to gtk3 would be appreciated. Thanks :)

SimHacker commented 3 years ago

Holy shit!!! Mind=blown. I'm sorry I missed this first time around, but just discovered and read your wonderful paper: Using Fractal Neural Networks to Play SimCity 1 and Conway’s Game of Life at Variable Scales https://arxiv.org/pdf/2002.03896.pdf From your recent tweets it seems like you've made a lot of great progress: https://twitter.com/Smearle_RH/status/1391607674261938176 Any idea how much work it would take to get this to run on an M1 Mac with TensorFlow PluggableDevices? Or are you already doing that? https://developer.apple.com/metal/tensorflow-plugin/

SimHacker commented 3 years ago

The closest thing I came to putting AI into SimCity was integrating Joe Strout's, Jeff Epler's, and Jez Higgins' Python version of Eliza with the micropolis-online chat window. ;)

https://github.com/SimHacker/micropolis/blob/master/MicropolisCore/src/pyMicropolis/micropolisEngine/eliza.py

https://github.com/SimHacker/micropolis/blob/master/MicropolisCore/src/pyMicropolis/micropolisEngine/micropolisturbogearsengine.py#L1215

SimHacker commented 3 years ago

In gym-city/notes/notes, you asked: is there any point at which continuous strips of road become worthwhile? Or will the agent be able to effectively maximize population without roads? A: Zones get a bonus whenever the traffic simulation completes a road trip from a zone of one type to a zone of another type, so as long as you have clusters of r/c/i that are connected by roads, it will increase the growth rate. And successful trips generate traffic along the (random) path. But traffic generates pollution, and pollution affects land value, and land value affects growth rate, and growth rate affects population, and population affects traffic, and so on. The definitive guide to what the fuck SimCity is doing is Chaim Gingold's SimCity Reverse Diagrams: https://lively-web.org/users/Dan/uploads/SimCityReverseDiagrams.pdf He's deeply analyzed SimCity in his thesis on "Play Design": https://search.proquest.com/docview/1806122688 This is the "Make Traffic" page of the reverse diagram: (image: Screen Shot 2021-06-09 at 14 59 30)

SimHacker commented 3 years ago

And on the topic of traffic and dynamic routing and cellular automata, have you seen Dave Ackley's long term visionary work on the Movable Feast Machine, and the amazing Distributed City Generation rule that his student Trent R. Small developed, "using bottom up distributed robust-first computing principles"? https://www.youtube.com/watch?v=XkSXERxucPc&ab_channel=DaveAckley

This paper explains how it dynamically routes traffic to its destination by flowing information about zones down sidewalks towards intersections! "Figure 2: Sidewalks contain a map of their city block distance from every building type." https://www.cs.unm.edu/~ackley/papers/paper_tsmall1_11_24.pdf
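A toy version of that sidewalk distance map, to make the routing idea concrete. This is a centralized breadth-first search on a made-up grid, so it is only an assumption-laden sketch of the result that local, distributed updates would settle on, not the rule from the paper. The grid layout and the letters H (house) and S (shop) are invented for the example; '#' marks cells that cannot be walked through.

```python
from collections import deque

GRID = [
    "H....",
    ".###.",
    ".###S",
    ".....",
]

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_fields(grid):
    """For each building letter, map every reachable cell to its city block distance."""
    rows, cols = len(grid), len(grid[0])
    fields = {}
    for r in range(rows):
        for c in range(cols):
            if grid[r][c].isalpha():
                fields.setdefault(grid[r][c], {})[(r, c)] = 0
    for dist in fields.values():
        queue = deque(dist)  # start from every building of this type at once
        while queue:
            r, c = queue.popleft()
            for dr, dc in NEIGHBOURS:
                cell = (r + dr, c + dc)
                inside = 0 <= cell[0] < rows and 0 <= cell[1] < cols
                if inside and grid[cell[0]][cell[1]] != "#" and cell not in dist:
                    dist[cell] = dist[(r, c)] + 1
                    queue.append(cell)
    return fields


def step_toward(fields, kind, cell):
    """Route one step: move to the neighbouring cell that is closest to `kind`."""
    dist = fields[kind]
    r, c = cell
    options = [(r + dr, c + dc) for dr, dc in NEIGHBOURS if (r + dr, c + dc) in dist]
    return min(options, key=dist.get)


fields = distance_fields(GRID)
print(fields["S"][(0, 0)], step_toward(fields, "S", (3, 0)))
```

Once every walkable cell carries a distance for each building type, a traveller needs no global path: at each cell it just steps to whichever neighbour has the smaller number for its destination type.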

SimHacker commented 3 years ago

The Movable Feast Machine is capable of a wide variety of behaviors: it's not just for generating cities, but is especially good at randomly moving and diffusing particles around, and simulating biological processes. It's like a cellular automaton, but different: designed to do many things traditional cellular automata can't do very easily or at all, like running robustly on the massively parallel but failure prone borg starships and gray goo nanobots of the future.

SimHacker commented 3 years ago

Robust-first computing: Demon Horde Sort (full version) https://www.youtube.com/watch?v=helScS3coAE&ab_channel=DaveAckley

Intercellular Transport in the Movable Feast Machine https://www.youtube.com/watch?v=6YucCpYCWpY&ab_channel=DanielCannon

Programming the Movable Feast Machine with λ-Codons https://www.youtube.com/watch?v=DauJ51CTIq8

Object Classification in the Movable Feast Machine https://www.youtube.com/watch?v=5iUeCk9dqvo&ab_channel=JoshuaDonckels

smearle commented 3 years ago

Thank you for this info and these links! The Reverse Diagrams are very helpful. I'm guilty of treating the game as a bit of a black box while throwing AI at it -- typical AI guy behavior. (Played a lot of SC4 growing up but started blindly hacking the reinforcement learning loop as soon as I found this code.) I wonder if the fact that I jacked the speed setting up quite aggressively is kind of eclipsing the usual growth rate benefit that would result from traffic.

I wasn't familiar with the Movable Feast Machine, but wow -- these sent me down a Dave Ackley youtube rabbit hole. I will have to play around with the code. The Distributed City Growth rule in particular is fascinating. I've long dreamt of a differentiable, cellular-automaton-based city sim, and this seems quite close to the latter at least. (Not sure if the "differentiable" part is worth it -- I guess I was thinking "learnable/evolvable game mechanics!" but there may be other ways to go about that.)

Something my advisor and I have been discussing for a while is the possibility of training a diverse population of SimCity-playing agents. Each agent would be a little convolutional neural network (i.e. a neural cellular automaton), whose weights are evolved using quality-diversity methods to build different kinds of cities. The hope is that this would allow us to (somewhat) methodically explore the space of cities afforded by the game's mechanics and engage with its politics (and/or quirks). Is it possible to maintain high population without traffic (or employment)? To what extent can we get away with heavy pollution? What do the most profitable cities look like? etc.

Regarding M1 Macs and Tensorflow: I've run the code on a Mac, and a previous version used tensorflow, though I've since switched to a different RL training loop that uses pytorch. Still, this loop was originally a port of OpenAI's stable-baselines code (in tensorflow), so it shouldn't be too painful to make it compatible in theory. I just recently got an M1 Macbook. Working toward some deadlines next week, but after those I'll see if I can get something running on the Mac with GPU acceleration.

SimHacker commented 3 years ago

Those are good ideas about training agents to edit the map. It parallels my thoughts about how to refactor SimCity into a multi player game in which different players have different roles, like driving the bulldozer around, driving a road paving machine, driving house building machines along the roads, and programming robots to help you build and maintain your city. You could recast the city editing tools as "vehicles" or agents (like the helicopter that finds heavy traffic or puts out fires or chases criminals, or a train that delivers building supplies for other robots to deploy), and both players and AIs could learn to drive them around (using a more Atari-like interface, that's easier for an AI to learn and much more constrained and focused than full-powered free-form map editing).

Robot Odyssey meets SimCity! See Alan Kay's comments on Robot Odyssey:

https://github.com/SimHacker/micropolis/blob/master/turbogears/micropolis/htdocs/static/html/alankay.html

Here are some links related to those ideas:

This is a summary I wrote of one of Will's talks that you might find interesting:

Will Wright on Designing User Interfaces to Simulation Games (1996) A summary of Will Wright’s talk to Terry Winograd’s User Interface Class at Stanford, written in 1996 by Don Hopkins, before they worked together on The Sims at Maxis.

https://donhopkins.medium.com/designing-user-interfaces-to-simulation-games-bd7a9d81e62d

And here's an illustrated transcript and a video of a talk I gave about applying ideas from Constructionist Education to Micropolis:

Micropolis: Constructionist Educational Open Source SimCity Illustrated and edited transcript of the YouTube video playlist: HAR 2009: Lightning talks Friday. Videos of the talk at the end.

https://donhopkins.medium.com/har-2009-lightning-talk-transcript-constructionist-educational-open-source-simcity-by-don-3a9e010bf305

Here's a video demo of the Python web server / Flash browser client based Micropolis, that shows the plug-in PacMan agent and Church of Pacmania zone scripted in Python:

https://www.youtube.com/watch?v=8snnqQSI0GE&ab_channel=DonHopkins

Here's an outline I wrote about the stuff I did and stuff I wanted to do, to make a multi player Python scriptable SimCity/Micropolis using AMF and OpenLaszlo. At the end there are some "Future Plans" that touch on applying the ideas of Constructionist Education to teach language and writing and group collaboration skills! Like using collaborative SimCity as a communication medium and discussion tool for kids to help each other learn by writing proposals, discussing, campaigning, voting on, implementing, and journalising them.

https://github.com/SimHacker/micropolis/blob/master/PROGRESS.txt

Future plans.

Shared city library.
Journal, chat, IRC.
Deeper MediaWiki integration.
Multi user support.
    Avatar chat in game.
    Avatars as editing tools and programmable bots.
    Writing proposals.
    Campaigning for issues.
    Voting on proposals.
    Cooperative multi user interface.
    Writing down ideas, justifying your proposals to other players, and getting others to cooperate.
    Journalism and creative writing.
    City newspaper.
    Publish stories about cities in the wiki.
    Live playable views of save files associated with stories.
Facebook interface.
PayPal interface for micropayments to buy virtual money, cheats, high speed simulation, etc.
Voice and video conferencing.
Video playback for tutorials, reporting and education.
Amazon Web Services Elastic Computing Support.
    Load balancing.
    Clustering.
Online community.
    Sharing content.
    Rating. 
    Reporting.
    Discussion groups.

https://github.com/SimHacker/micropolis/blob/master/laszlo/micropolis/TODO.txt

Saving and Structuring Cities and Scenarios

Each city has a "parent" field pointing to the city that it was loaded from.

Are city save files mutable? Ideally, no, so parent links will point to the city at the state it was saved. But pragmatically we want to be able to save the state of an ongoing simulation as a checkpoint efficiently. When do you branch off a new save file? If there are any children pointing to the current city as their parent, then you should branch off a new city to save the current city state. So there will be immutable "branch" cities that have children, and mutable "leaf" cities that are actively simulating.

Each user is initially allowed one mutable "leaf" city. They can generate a new terrain, load a scenario, or load any of their cities or other users' shared cities into it. They can simulate it, and during the simulation, the state is regularly saved into the mutable city. They can save their mutable city into a new immutable city, and share it if they like, so any users can load, play and branch off from it. The city saves are arranged in a tree, and users can branch off of other users' cities. The user interface lets you branch off of a user's active mutable city, but behind the scenes it clones the current state of the mutable city into a new immutable snapshot, then branches off from that (by loading it into the user's current mutable city).
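One way the branch/leaf save tree described above could be modeled, as a rough sketch. The City class, its field names other than "parent", and the save_snapshot and load_into helpers are all hypothetical; nothing here is taken from the actual Micropolis code, and the "funds" state is just placeholder data.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare cities by identity, since the tree is self-referential
class City:
    name: str
    state: dict
    parent: Optional["City"] = None
    mutable: bool = True
    children: List["City"] = field(default_factory=list)

    def __post_init__(self):
        if self.parent is not None:
            self.parent.children.append(self)


def save_snapshot(leaf, name):
    """Clone a mutable leaf's current state into a new immutable city."""
    return City(name, copy.deepcopy(leaf.state), parent=leaf.parent, mutable=False)


def load_into(leaf, source):
    """Load `source` into a user's mutable leaf; live cities are snapshotted first."""
    if source.mutable:
        source = save_snapshot(source, source.name + "-snapshot")
    if leaf.parent is not None:
        leaf.parent.children.remove(leaf)
    leaf.state = copy.deepcopy(source.state)
    leaf.parent = source
    source.children.append(leaf)
    return source


scenario = City("scenario", {"funds": 20000}, mutable=False)
alice = City("alice-leaf", {})
load_into(alice, scenario)
alice.state["funds"] -= 500        # alice simulates for a while

bob = City("bob-leaf", {})
origin = load_into(bob, alice)     # branching off a live city goes via a snapshot
print(origin.name, bob.state)
```

Because parents are only ever immutable snapshots, a parent link always points at the exact state the child was loaded from, which is the property the TODO asks for.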

Unify the identification of scenarios, anonymous generated cities, and saved cities. Make an immutable city for each scenario. Add a field to specify which scenario preparation function is run when the city is loaded (or none if empty). Generalize that for scripting plug-in scenarios in both python and xml. An xml file will describe and program the scenario (some dialogs to show, some variables to configure, some python code to run at start-up, etc). Save scenarios as separate objects from cities, and associate an ordered list of scenarios with city save files, so you can make general purpose layered scenarios, and plug any city into any scenario. Scenarios can be layered to dynamically generate terrains of different types, set up initial conditions for lessons and experiments, show dialogs and inject events into the game, detect success and failure state, report progress, add user interface elements, tools, zones, agents, etc.

Multi Player

Chat interface.

Language translation. Users select their language from a menu. Server calls Google Translate to translate what they say to the different languages of other players in the chat.

Chat visibility. Who can see you, your messages, your cities?

Channels. IRC-like channels for various conversation topics.

Visual/video chat. Scrolling chat message window. Messages show the Facebook profile image or avatar of the user. Messages can include URLs to link to external web pages. Messages can include images to show. Messages can include videos to show. Messages can show video feeds of other users. Click on a chat message to puff it up into the notice window. Notice window has a history of messages that have been displayed in it, so the user can go back to previous messages. Notice views of messages can include full sized pictures, links and controls. Open up the user's avatar to see and load their current and saved cities. A flat list of one user's cities, or tree browser view (linked into other users' cities). Notice window can include user interface dialogs. Answer a yes/no or multiple choice question. Users can vote on issues that affect each other's simulations. If you want to raise taxes, you have to write an explanation, and get other users (who are playing their own cities) to vote for it. You can share messages on your Facebook page, including links to vote or answer questions.

Virtual friends. Online documentation, tutorials, tips and advice, and psychiatric counseling in the form of agents like Eliza, ALICE. Currently has an Eliza bot. Integrate ALICE bots running remotely. http://www.pandorabots.com/botmaster/en/home

Speech synthesis.

Connect games to real-world problems.

Write a letter to your congressman from within the game, about a particular issue that you learned about and played around with in the game.

https://github.com/SimHacker/micropolis/blob/master/MicropolisCore/src/MicropolisEngine/doc/PLAN.txt

TODO for Education:

Educational.
  Bring old Micropolis courseware up to date, and integrate with the game. 
  Export simulation data to spreadsheet or xml. 
  Creative writing, storytelling, newspaper reporting, blogging, etc.
  Scenarios and experiments.
  What-if? 
    Branching history at decision point, and comparing different results. 
  Scripting. 
    Open up the simulator to Python. 
    Web services to remotely monitor and control simulation. 
    HTML or AJAX web server remote control interface.
      Support multi player interactions through web server.
        Submit a proposal to build a stadium over the web.
        Style it like a real civic government web page, that allows citizens to participate online. 
    Enable extending the graphics, tiles, sprites. 
    Enable programming behaviors, defining new zones, new global variables, overlays, etc. 
    Cellular automata.
    Visual programming.
    Programming by example. 
    KidSim, AgentSheets. 
    Robot Odyssey.

TODO for Programming:

Visual Programming

  Simplify the Micropolis interface and make it easier for kids to
  use it with the game controller, in a way that will support
  multi player interaction.

  Collapse the separate concepts of game editing tool (bulldozer,
  road, residential zone, etc) and agent (sprites like the
  monster, tornado, helicopter, train, etc).

  Agents with specialized tools represent different roles that kids
  can play. A bunch of kids can join together and play different
  roles at the same time in the same city. Instead of having a
  bunch of editing tools to switch between, you have a bunch of
  different agents you can drive around the map, like using a
  monster to crush things instead of a bulldozer, or riding around
  in a helicopter to scroll around and observe the map. Make a
  meta-game like pokemon trading cards or magic the gathering,
  about acquiring and deploying and using agents on the map. Give
  agents different budgets and constraints.

  Use an agent to represent a user in the world, and control an
  editing tool. You see other users in the map driving around
  their editing tool agents.

  Each editing tool can be associated with a particular agent,
  with a keyboard/game controller based user interface for moving
  around, as well as a mouse based interface for picking it up and
  dragging it around.

  The road tool becomes a road building vehicle, that you can
  easily move up/down/left/right/diagonally with the game
  controller directional input. Requires much less coordination to
  draw straight roads than with a mouse. 

  The bulldozer tool becomes an actual bulldozer that you can
  drive around the map, crushing things in your wake.

  This makes the game easily usable by little kids in book mode. 

  Also support small children using Micropolis like a drawing tool or
  etch-a-sketch, simply doodling with the editing tools for the
  visceral pleasure of it, and setting fires and other disasters
  to watch it burn and mutate.

  Logo Turtles (as a generalization of the monster, tornado,
  helicopter, etc)

    Implement programmable logo turtles as agents that can move
    around on the map, sense it, and edit it. 

    Like Robot Odyssey agents, so you can go "inside" an agent,
    and travel around with it, operate its controls, read its
    sensors, and automate its behavior by wiring up visual programs
    with logic and math and nested "ic chip" components.

    Plug in graphics to represent the agent: use classic logo
    turtle and Micropolis sprites, but also allow kids to plug in
    their own.
      Micropolis sprites have 8 rotations. 
      SVG or Cairo drawings can be rotated continuously.

    Re-implement the classic Micropolis agents like the monster,
    tornado, helicopter, train, etc in terms of logo turtles, that
    kids can drive around, learn to use, open up and modify (by
    turning internal tuning knobs, or even rewiring).

    Let kids reprogram the agents to do all kinds of other stuff.

    Mobile robots, that you can double click to open up into
    Robot-Odyssey-esque visual program editors.

    Agents have local cellular-automata-like sensors to read
    information about the current and surrounding tiles.

    KidSim / Cocoa / StageCraft Creator let kids define visual
    cellular automata rules by example, based on tile patterns and
    rules. Show it a pattern that you want to match by selecting
    an instance of that pattern in the world, then abstract it
    with wildcards if necessary, then demonstrate the result you
    want it to change the cell to in the next generation.
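A tiny sketch of what a "rule by example with wildcards" could look like on a tile grid. This is not how KidSim, Cocoa, or Stagecast Creator implement it; the single-character tile codes (F for fire, T for tree, R for road), the "*" wildcard, and the apply_rule function are all made up for illustration.

```python
WILDCARD = "*"


def matches(grid, r, c, pattern):
    """True if `pattern` matches the grid with its top-left corner at (r, c)."""
    for dr, row in enumerate(pattern):
        for dc, want in enumerate(row):
            if want != WILDCARD and grid[r + dr][c + dc] != want:
                return False
    return True


def apply_rule(grid, pattern, target, result):
    """Next generation: wherever the pattern matches, set the cell at offset `target`."""
    rows, cols = len(grid), len(grid[0])
    p_rows, p_cols = len(pattern), len(pattern[0])
    nxt = [list(row) for row in grid]  # read from the old grid, write to the new one
    for r in range(rows - p_rows + 1):
        for c in range(cols - p_cols + 1):
            if matches(grid, r, c, pattern):
                nxt[r + target[0]][c + target[1]] = result
    return ["".join(row) for row in nxt]


world = ["F.T", "FRT", "..T"]
# Demonstrated on "F.T", then abstracted: the middle cell can be anything.
burned = apply_rule(world, ["F*T"], (0, 2), "F")
print(burned)
```

The wildcard is what turns one demonstrated instance into a general rule: the example was shown with an empty cell in the middle, but the abstracted pattern also fires across the road in the second row.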

    Sense high level information about zones and overlays, so the
    agents can base their behavior on any aspect of the world
    model.

      Support an extensible model by allowing users to add more
      layers.

        Add layers with arbitrary names and data types at
        different resolutions:

          byte, int, float, n-dimensional vector, color, boolean
          mask, musical note, dict, parametric field (i.e. perlin
          noise or other mathematical function) at each cell, etc.

    Edit the world. 

      All Micropolis editing tools (including colored pens that draw
      on overlays) should be available to the agent.

      Enable users to plug in their own editing tools, that they
      can use themselves with the mouse, keyboard or game
      controller, or program agents to use to edit the map under
      program control.

  Robot Odyssey

    Build your own universal programmable editing tool.
    Roll your own von Neumann Universal Constructor.
    Smart robots you program to perform special purpose editing tasks. 

    The "Painter" picture editing program had a way of recording
    and playing back high level editing commands, relative to the
    current cursor position.

    Remixing. Journaling. Programming by demonstration or example.
    You could use a tape recorder to record a bunch of Micropolis
    editing commands that you act out (or you can just select them
    from the journal), then you can play those tapes back with
    relative coordinates, so they apply relative to where the
    agent currently is on the map. You can copy and paste and cut
    and splice any editing commands into tapes that you can use to
    program the robot to play back in arbitrary sequences. 
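The tape idea above might be sketched like this: commands are stored as offsets from wherever the agent stood while recording, so playback can happen anywhere. The Tape class, the tool names, and the apply_tool callback are hypothetical and not part of the existing Micropolis code.

```python
class Tape:
    """A recorded list of (tool, dx, dy) editing commands, relative to the agent."""

    def __init__(self, commands=None):
        self.commands = list(commands or [])

    def record(self, origin, tool, x, y):
        """Store an absolute edit at (x, y) as an offset from the agent's position."""
        ox, oy = origin
        self.commands.append((tool, x - ox, y - oy))

    def play(self, apply_tool, position):
        """Replay every command relative to the agent's current position."""
        px, py = position
        for tool, dx, dy in self.commands:
            apply_tool(tool, px + dx, py + dy)

    def splice(self, other):
        """Cut-and-paste: return a new tape that plays this one, then the other."""
        return Tape(self.commands + other.commands)


tape = Tape()
tape.record((10, 10), "road", 10, 10)
tape.record((10, 10), "road", 11, 10)
tape.record((10, 10), "residential", 11, 11)

edits = []
tape.play(lambda tool, x, y: edits.append((tool, x, y)), position=(3, 7))
print(edits)
```

Passing the editing function in as a callback keeps the tape independent of the engine, so the same recording could drive a real editing tool, a dry-run preview, or another agent.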

    Program an urban sprawl development-bot to lay out entire
    residential subdivisions, complete with zones, roads, parks and
    wires. Then program a luddite roomba-bot that sucks them all
    up and plants trees in their place.

    This becomes really fun when we let players plug in their own
    programmed zones for the robot to lay out, and layers of data
    to control the robot's behavior, out of which they can program
    their own cellular automata rules and games (like KidSim /
    Cocoa / StageCraft Creator).

Porting to gtk3 / Playing with a neural network #86