Mindwerks / worldengine

World generator using simulation of plates, rain shadow, erosion, etc.
MIT License
982 stars 128 forks source link

Make use of Numpy-Arrays #138

Closed tcld closed 8 years ago

tcld commented 8 years ago

Another branch, as promised (https://github.com/Mindwerks/worldengine/issues/136). I replaced almost all arrays with numpy-arrays. Everything that is not mentioned in the serialization-tests is probably not touched, either. (I had to start somewhere.)

Here some performance-test results:

Parameters:

  width        = 384
  height       = 256
  plates       =   5
  oceansurface = 65% (only effective in the numpy_repl-branch, which is based on ocean_level_fix)
  seed         = 13047

Generation times for every branch:

  master     = 39.57 s
  perf_opt   = 29.22 s
  numpy_repl = 28.40 s
  (numpy_repl + platec1.4.0 = 21.48 s)
  (numpy_repl + platec1.4.0+ small improvements = 20.44 s)

Images (master - perf_opt - numpy_repl - platec1.4.0): seed_13047_biome seed_13047_biome seed_13047_biome seed_13047_biome

The latest changes were not supposed to speed things up too much, only some of the excessively used functions were replaced. I did what I could while keeping an eye on the target of getting numpy ready. In the future it won't be a pain to get a bit of optimization done anymore. And maybe the memory footprint will have changed a bit, I still have no way of reliably checking that. I will do some tests with larger maps and see if anything changed there. (And I am really looking forward to the new version of platec. :) )

All tests except two are passed, although I did run one of the regeneration-scripts in the tests-folder to make it happen. At this point, especially due to the change to the seeding, worldengine-data should be updated a bit anyway, I think. (One of the failed tests, maybe even both, can probably be passed with fresh data - maybe it has to be Python 2/3-specific, though.)

EDIT: I took a look at the new platec version and added the time above. In total we are down to roughly 50% of the generation time from a couple of days ago! :)

EDIT2: As far as I see, the failed test comes from the fact that the old world-file was generated without numpy-arrays. I wouldn't know what to do about that - on my system regenerating the test-world was enough, though.

EDIT3: I did a little more profiling this evening. The air gets thin now, there are only three methods left that slightly justify spending time on with optimization (world.is_land()~11% of total time, world.tiles_around()~20%, common.anti_alias_point()~20% - that's roughly 50% in total!). And after looking at them for a while and trying some things I really couldn't come up with any improvement. A funny sidenote is that the function calls for my testmap went done from more than 300 million to less than 100 million. Half of which is attributed to is_land(). Speed-wise this might be the end of the road, unless a lot more work is invested.

         95833445 function calls (95705636 primitive calls) in 92.104 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  4014080   19.543    0.000   19.543    0.000 worldengine/../worldengine/common.py:97(anti_alias_point)
  1262482   18.308    0.000   27.963    0.000 worldengine/../worldengine/world.py:375(tiles_around)
     1150   14.349    0.012   14.349    0.012 {built-in method step} -- this refers to platecs step()-method
 56451705   10.417    0.000   10.417    0.000 worldengine/../worldengine/world.py:341(is_land)
        1    2.312    2.312    4.155    4.155 worldengine/../worldengine/simulations/erosion.py:136(river_sources)
psi29a commented 8 years ago

Doing a rebuilt based on new master.

tcld commented 8 years ago

As said here https://github.com/Mindwerks/worldengine/pull/134 , I think that one of those last two commits (well, I hope it is one of those) might have introduced a problem. Once the results of a currently running generation are done, I will post an image, should anything go wrong. (Or just a message, if everything is ok.)

tcld commented 8 years ago

Update on this: I separated this branch from the ocean_level_fix-branch (https://github.com/Mindwerks/worldengine/pull/134). This branch passed all minus two tests for me (after running tests/data/data_generator.py as far as I remember) and it still produces exactly the same output as the master-branch. And it only needs about 75% of the time, I have to add. ;)

The ocean_level-branch will have to wait for a bit of discussion with you guys since the output might not be desirable. But that can be fixed, after some input. :)

@psi29a What do I have to do to update whatever tests/data/data_generator.py spit at me? This branch will not pass the tests without it, I think. (There is always complaining about a numpy-method that is missing.)

tcld commented 8 years ago

I added a ton of comments since these commits change quite a bit. Hopefully I didn't make anything unnecessarily cryptic.

psi29a commented 8 years ago
  File "/home/travis/build/Mindwerks/worldengine/worldengine/drawing_functions.py", line 101, in _find_land_borders
    if world.ocean[int(y / factor), int(x / factor)]:
TypeError: list indices must be integers, not tuple

This looks to me that ocean isn't numpy...

psi29a commented 8 years ago
  File "/home/travis/build/Mindwerks/worldengine/worldengine/world.py", line 483, in elevation_at
    return self.elevation['data'].T[pos]
AttributeError: 'list' object has no attribute 'T'

Is this a typo?

tcld commented 8 years ago

No. "T" is a call to the transpose-function of the numpy array. http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.T.html

The error only occurs when an old world is used. However, I can replace the call to T for something more clear, it doesn't gain any performance anyway. (That's why I tried it.)

tcld commented 8 years ago

It is "fixed" now. I should have done that earlier.

tcld commented 8 years ago

I didn't notice this before:

This looks to me that ocean isn't numpy...

I have not really understood why but that is the same problem as with the missing transpose() method: Somehow when loading an old map the arrays are not (yet?) numpy arrays. Did I overlook a place where the arrays could get into the program without being converted? Is this due to the pickle-format? After regenerating the test-world this worked on my local system. Not understanding why, though, is a problem, of course.

tcld commented 8 years ago

I think the problems come from this: https://docs.python.org/2/library/pickle.html#pickle.load

I assume the test-world is a pickle-world and things go like this:

  1. https://github.com/tcld/worldengine/blob/numpy_repl/worldengine/cli/main.py#L481
  2. https://github.com/tcld/worldengine/blob/numpy_repl/worldengine/cli/main.py#L208
  3. https://github.com/tcld/worldengine/blob/numpy_repl/worldengine/world.py#L50

The Protobuf-loaders could be manipulated in World.py, but the old pickle-save does not contain numpy-arrays and thus after loading the arrays cannot be used as such.

If this sounds feasible, we should really regenerate that world and see how the two tests do that so far have been failing (for me, at least).

psi29a commented 8 years ago

That was my first thought as well... loading old datafiles will have data as array and not numpy array.

This means we'll likely have to regenerate the world files to accept the new format. :)

We'll be breaking compatibility with older world files as a result. This is likely the case for protobuf as well.

This is why we have unit and functional tests, to help us stop and think about the problem before a user is exposed to it. :)

tcld commented 8 years ago

All the arrays are converted after being loaded respectively before being saved with protobuf. I expect old protobuf saves to still work; I don't have one to test it, though.

I never made the biome-map use protobuf (but I think that is the only one, it just looked like quite a bit of work). So if the world were to be regenerated...at some point another break of compatibility would have to occur, I fear.

tcld commented 8 years ago

I think I might at least put very basic code in there to have the biome-map be a numpy-array (right now), without the usual small optimisations that follow all over the code. That way we don't have to break pickle-compatibility twice. (By the way: If protobuf were to be the favourite serializer, maybe the test-worldfile should be in protobuf-format? That way we wouldn't need two separate files for Python 2 and 3, either.)

Or is this kind of break maybe too much of a problem and the switch to numpy not feasible? I would hate to see that.

psi29a commented 8 years ago

I agree totally...

@ftomassetti I think we're getting to that point where we need to pull the trigger on axing pickle. ;)

psi29a commented 8 years ago

In worldsynth, I used pytables which created a sort of SQL layer above HDF5. It supported inline compression and other nifty stuff. This is always an option btw...

tcld commented 8 years ago

Sounds great as an option for the future. :D Right now I want to get my branches compatible. I continue building on top of them and that feels like a bad idea when the base isn't 100% solid yet.

psi29a commented 8 years ago

Again, I agree. :)

tcld commented 8 years ago

This went surprisingly quickly. This branch should now replace all major arrays with numpy arrays. The next pull request https://github.com/Mindwerks/worldengine/pull/141 will take care of smaller ones that are spread around the code. To pass the tests it will still need a freshly generated world (generated with this branch, right? So maybe I should do that?), but it passed all tests that I threw in its way.

ftomassetti commented 8 years ago

Dropping pickle is fine for me: I just started using it because it was the easiest thing to use in the beginning. I am all in favor of formats which are easily reusable from other languages, that was my goal with protobuf. If protobuf does not convince you I am absolutely open to alternatives.

psi29a commented 8 years ago

Before we merge this in, can you try fixing the failing test?

Fork https://github.com/Mindwerks/worldengine-data and send a PR with a world file using the same seed number.

Once this is merged, we can retry the build here and merge. :)

tcld commented 8 years ago

So...I fork worldengine-data, add a freshly generated world (Using worldengine/tests/data/data_generator.py ?)

There are three worlds: https://github.com/tcld/worldengine-data/tree/master/tests/data

Why are there three? Which one will I replace?

psi29a commented 8 years ago

I don't know, that is why I asked to replace the one with the failing seed... you can specify the seed number running WE manually. The resulting world file should be good enough right?

    def test_draw_ancient_map_factor1(self):
        w_large = World.from_pickle_file("%s/seed_48956.world" % self.tests_data_dir)
        target = PixelCollector(w_large.width, w_large.height)
        draw_ancientmap(w_large, target, resize_factor=1)
        self._assert_img_equal("ancientmap_48956", target)

That would be the seed_48956.world (just use 48956 as your seed), make sure to ask for an ancient_map as output. Commit those two, and that should fix one test. :)

tcld commented 8 years ago

Which parameters do I use for that world?

Maybe that doesn't matter?

How about replacing all the different seeds with one to make things simpler? :P

psi29a commented 8 years ago

Create a new one, just specify that seed.

You can then run we again, this time with ancientmap option and use the resulting world file from above.

tcld commented 8 years ago

I won't touch the images at all, none of them is named in a way that makes me sure if I should replace it or not. If they were supposed to, they would have their standard names so I could easily overwrite them...right?

tcld commented 8 years ago

Of course I touched all the images. And I think I updated everything that was possible - pickle-save (and the one corresponding image) excluded. https://github.com/Mindwerks/worldengine-data/pull/1

As I stated in that pull request: There recently was a change in the seeding mechanism so the blessed images probably needed to be regenerated anyway. But I am still not entirely sure how they work and how they are defined as "blessed". So if regenerating them was a bad idea, I need more detailed instructions, please.

(I really hope this all works out and this complicated branch will be merged. It is giving me headaches that I cannot really touch any of the other code before the numpy-stuff is out of the way.)

EDIT: And there are the pushes corresponding to your comments. Time to sleep.^^

tcld commented 8 years ago

Another commit. Now the only thing that is open to debate is this: https://github.com/Mindwerks/worldengine/pull/138/files#diff-6370c3dd8fe78487da3ac1540b8d3b4cR124

I would suggest leaving it in. There are quite a lot of actually dead functions (maybe a cleaning pass should be underdone), this one at least provides the World class with a working equality operator.

tcld commented 8 years ago

I could use some help right now: I generated a full set of data, including that for Python 2. Under Python 3 all tests pass, under Python 2 the ancient-map tests don't pass (that is 2/40). Did I do something wrong while generating? I don't know anything about the ancient maps yet, I thought they are something that is generated after the actual world is already finished and saved. Is that wrong?

(Once this is sorted out, I can push all the changes and this should be done.)

PS: So far I haven't touched the code for the ancient maps. I really don't know what's going on. Is there maybe another rogue RNG seed at work?

tcld commented 8 years ago

Here the coloured version - Python 3 first, Python 2 last:

ancientmap_28070_factor3ancientmap_28070_factor3 The mountains in the middle are oriented differently. (protobuf)

ancientmap_48956ancientmap_48956 These two are very different. :( (pickle)

tcld commented 8 years ago

I will take another look at this tomorrow morning. Maybe I made some mistake that happens to make Python 2 and 3 create different outputs.

EDIT: Maybe there is a chance that this is a legitimate bug? I don't know how well Python 2/3 compatibility had been tested before.

EDIT2: All other output-images seem to be fine. And since my Python 3 version passed the test, it couldn't be randomness introduced during the ancient-map generation. Are there any preparatory steps during world generation that could influence the outcome?

tcld commented 8 years ago

I think I found the problem.

Python 3.4.3 (default, Oct 14 2015, 20:28:29) 
[GCC 4.8.4] on linux
>>> import random
>>> random.seed(0)
>>> random.randint(0, 1000000)
885440
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
>>> import random
>>> random.seed(0)
>>> random.randint(0, 1000000)
844422

Maybe we need a new random module. And a new noise module? :(

For now I guess I will push all my changes and then we regard this is a problem to be fixed. In principle it concerns all drawing functions that make use of the random module.

ftomassetti commented 8 years ago

I am wondering: did you generate the two random numbers on the same machine just using different version of Python? I ask because I read "linux" and "linux2". If the random generator gives different numbers on different versions of Python that is bad :( I am looking for library to replace it. I know it is frustrating to fix those tests but hopefully they will protect us by unintended changes.

tcld commented 8 years ago

Just opened an issue: https://github.com/Mindwerks/worldengine/issues/145 Yes, same machine. But as can be seen different compilers.

The changes in the Python-module can probably be spotted and compared. I don't know about gcc, though.

Should I go through with the pushes and we sort this out later? I will switch out problematic parts in the binaries for the Python 2 versions so the Travis tests should work out at least.

tcld commented 8 years ago

Finally: Here https://github.com/Mindwerks/worldengine-data/pull/2 is the PR for the actual, final, eternal data that makes Python 2 pass all tests. (Python 3 doesn't like the ancient maps, everything else works - 38/40.)

If that PR is merged and the tests are rerun, things should work fine. I really, really hope. :)

tcld commented 8 years ago

How do you trigger a re-test? EDIT: I still don't know but I had to do another push anyway.

tcld commented 8 years ago

Yippie! :)

psi29a commented 8 years ago

Make it so @ftomassetti ? :) I'm happy with this.

tcld commented 8 years ago

What's the situation here?

ftomassetti commented 8 years ago

Fine by me: all tests are passing :) Should we hit the merge button or is there anything else any of you want to check?

tcld commented 8 years ago

I didn't want to change anything about this branch anymore, no. The ones after might need some polish, though.

ftomassetti commented 8 years ago

Merged, now we can see what adaptations the other PRs require