New Tests; Several Bug-Fixes

tcld commented 8 years ago

I tried to add a test for the satellite-view (and added a couple lines of code to properly test the scatter-plot).

I think I might have chosen a bad time for this since @pangal might want to continue work on this himself...but I thought this might be a quick change. It isn't, sadly, so far I haven't gotten the code to pass the satellite-test. I don't really know why, but maybe somebody sees the flaw.

EDIT: Here the data-PR https://github.com/Mindwerks/worldengine-data/pull/12, even though this should not be merged right now due the failed final test.

EDIT2: If @pangal (or somebody else) wants to continue work on this, I could probably rebase this PR, so please don't hesitate. :)

EDIT3: This is now based on #163.

tcld commented 8 years ago

Here the test that fails for me (using the updated data):

self = <tests.draw_test.TestDraw testMethod=test_draw_satellite>

    def test_draw_satellite(self):
        w = World.open_protobuf("%s/seed_28070.world" % self.tests_data_dir)
        target = PixelCollector(w.width, w.height)
        draw_satellite(w, target)
>       self._assert_img_equal("satellite_28070", target)

tests/draw_test.py:175: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/draw_test.py:70: in _assert_img_equal
    % (x, y, blessed_pixel, drawn_pixel))
E   AssertionError: Tuples differ: (244, 244, 251, 255) != (250, 255, 265, 255)
E   
E   First differing element 0:
E   244
E   250
E   
E   - (244, 244, 251, 255)
E   + (250, 255, 265, 255) : Pixels at 39, 16 are different. Blessed (244, 244, 251, 255), drawn (250, 255, 265, 255)

No idea yet why this happens, but it is reproducible (I tried three times, the results were exactly the same every time).

ftomassetti commented 8 years ago

I am not sure about this error, I guess it could be due to extraction of random values but that is it

tcld commented 8 years ago

I think I traced down the problem:

r, g, b, a = target.pixels[x, y]  # target is a ImagePixelSetter
r, g, b = test[y, x]  # test is a numpy-array with no special data type (I assume int32)

Both "arrays" have been written using exactly the same code, their content at several pixels differs, though.

Error-message of the test (the first values stem from a proper PNG-file, the latter from the actively drawn test-image):

Blessed (244, 244, 251, 255), drawn (250, 255, 265, 255)  # ImagePixelSetter
Blessed (250, 255, 255, 255), drawn (250, 255, 265, 255)  # numpy-aray

So two issues crop up: An overflow is happening (3rd value) and the read-out of the ImagePixelSetter returns a different value than the numpy-array.

EDIT: No, it doesn't appear to be related to the randomized portion of the code, that looks fine. It is something within ImagePixelSetter that isn't acting as I would expect it to.

The code-bloc, by the way, looks like this:

    # Loop through and average a pixel with its neighbors to smooth transitions between biomes
    for y in range(1, world.height-1):
        for x in range(1, world.width-1):
            ## Only smooth land tiles
            if world.is_land((x, y)):
                # Lists to hold the separated rgb values of the neighboring pixels
                all_r = []
                all_g = []
                all_b = []

                # Loop through this pixel and all neighboring pixels
                for j in range(y-1, y+2):
                    for i in range(x-1, x+2):
                        # Don't include ocean in the smoothing, if this tile happens to border an ocean
                        if world.is_land((i, j)):
                            # Grab each rgb value and append to the list
#                            r, g, b, a = target.pixels[i, j]
                            r, g, b = test[j, i]  # just testing
                            all_r.append(r)
                            all_g.append(g)
                            all_b.append(b)

                # Making sure there is at least one valid tile to be smoothed before we attempt to average the values
                if len(all_r) > 0:
                    avg_r = int(sum(all_r) / len(all_r))
                    avg_g = int(sum(all_g) / len(all_g))
                    avg_b = int(sum(all_b) / len(all_b))

                    ## Setting color of the pixel again - this will be once more modified by the shading algorithm
                    test[y, x] = (avg_r, avg_g, avg_b)
                    target.set_pixel(x, y, (avg_r, avg_g, avg_b, 255))

tcld commented 8 years ago

Most of the problems came from over-/underflowing uint8s that were written to the PNG. But even after fixing those I am left with a failed test due to differing pixels (not caused by the RNG, not caused by rounding and not caused by over-/underflows).

I tried rebasing this PR to #163 while searching for the cause of the problem and that worked fine. Now I am a little bit unsure if it is worth the time trying to find the problem in a piece of code that I kind of wanted to see replaced in the first place. Would it be ok if I just put this on top of #163 (with some small adjustments)? (That question is especially directed at @psi29a.)

My latest findings are in here, although the (new) tests still cannot be passed.

psi29a commented 8 years ago

Put it on top of #163 , we we're ready I'll merge it in. Having pypng as an additional depend isn't a problem as it is pure python.

tcld commented 8 years ago

Ok.

tcld commented 8 years ago

I had to take care of some other things first. Rebase is now done, all tests are passed for me (Python 3).

@esampson: Would you mind taking a look at the scatter-plot I generated here? It looks quite different from the old one. It's not due to changes in the code (I checked using the old world, the output was the same); hopefully this is fine. EDIT I just remembered that you changed some things to make the points spread out more evenly. Maybe that's just what happened; the mentioned old world had been generated before you committed those changes.

Another remark: The precipitation-map now looks a lot better, the temperature-map is slightly less homogeneous. I don't really know where these differences come from compared to the old map, it might be due to some rounding that is performed. I only wanted to mention it since at some point @psi29a noticed that the those maps were a little too homogeneous and that this had been different in the past.

ftomassetti commented 8 years ago

Stupid question: why we inverted in almost all the arrays x and y? I agree we should be consisteng but wouldn't make sense to keep the order x, y?

tcld commented 8 years ago

There was a discussion about that, I think it is visible to you right in this PR (above your post). If you can't see it, I could summarize it.

ftomassetti commented 8 years ago

I realize you spent a lot of time inverting x, y to y, x and I really agree we should stick to one order in all the places, I just think as @pangal that x, y seems more natural than y, x. @psi29a any opinion on this?

tcld commented 8 years ago

Well, how about going for [y, x] now (at least it would be consistent) and then fully switching to [x, y] in a later PR.

The old arrays were all initialized with array = [[0 for x in range(width)] for y in range(height)] and then accessed via array[y][x]. When I switched to numpy I had to adhere to that or things would have probably exploded. The initial reasoning behind [y][x] might just be that [[0 for x in range(width)] for y in range(height)] is a very natural (i.e. western) way of looking at matrices: Rows are appended to form a full matrix (read from left-to-right, then top-to-bottom, as with books); the exact same way numpy does it (as the code-snippets above show).

We might just rename y to row and x to column. If we fully switched to array = [[0 for y in range(height)] for x in range(width)]... I can't wrap my head around this without lunch.^^

ftomassetti commented 8 years ago

I would normally be for splitting things among different PRs but changing all the array access and to change them immediately after seems to be way too complicate. I would say, let's just pick on way of access the arrays and stick with it. My preference is for [x, y] because it is more natural. Is there any strong reason to go for the opposite? Does it offer way better performances? Is the problem that [y, x] is more common?

I understand that when using separate indexes it makes sense to use [y][x] instead of [x][y]. Is in numpy arrays [x, y] just a shorctu for [x][y]?

On Mon, Nov 9, 2015 at 3:32 PM, tcld notifications@github.com wrote:

Well, how about going for [y, x] now (at least it would be consistent) and then fully switching to [x, y] in a later PR.

The old arrays were all initialized with array = [[0 for x in range(width)] for y in range(height)] and then accessed via array[y][x]. When I switched to numpy I had to adhere to that or things would have probably exploded. The initial reasoning behind [y][x] might just be that [[0 for x in range(width)] for y in range(height)] is a very natural (i.e. western) way of looking at matrices: Rows are appended to form a full matrix (read from left-to-right, then top-to-bottom, as with books); the exact same way numpy does it (as the code-snippets above show).

— Reply to this email directly or view it on GitHub https://github.com/Mindwerks/worldengine/pull/185#issuecomment-155079175 .

Website at http://tomassetti.me http://www.federico-tomassetti.it GitHub https://github.com/ftomassetti

tcld commented 8 years ago

The numpy-notation [y, x] is equivalent to [y][x].

Is there a strong reason to go with [x, y]? I found it useful to use [y, x] when printing a numpy-array for debug-purposes. x in that case refers to the "natural" x-coordinate. Other than that it probably doesn't really matter. Although a full switch will cause saves to become incompatible again, so it should be done sooner rather than later.

I think the benefits of this kind of cosmetic change are not worth the work. There are at least a couple lines of code in almost every source-file, very many lines in some, that would need to be changed. There is no speed to be gained and no significant simplification of code.

rbb commented 8 years ago

If you want to print a numpy array, with swapped rows and colums (in iPython for example), you can always use numpy.transpose().

For example:

myVar = numpy.rand(x,y)
print( myVar.transpose() )

ftomassetti commented 8 years ago

ok, you have convinced me, I am fine with [y, x] unless someone else has strong opinions about that

j-coppola commented 8 years ago

I strongly believe it should be [x, y] to keep convention with standard Cartesian coords - but I understand that it's a big undertaking to rewrite it, so I'm ok with [y, x] if it's consistent throughout the codebase :)

psi29a commented 8 years ago

Agreed with @pangal pangal over being consistent.

We are mixing Image nomenclature with Matrix nomenclature, Numpy uses the Matrix (I,J) nomenclature. https://en.wikipedia.org/wiki/Matrix_(mathematics)

 Z = np.zeros((2,4))
print(Z)
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

while Images (on screen) use the X,Y (top left to bottom right).

We shouldn't be confusing it more with Cartesian since we are not using graphs. ;)

As @tcld said, when dealing with Images, we should take the Numpy array and transpose it. However all work should remain in Numpy native (I,J) matrix format.

tcld commented 8 years ago

Ah, I was looking for something like this but didn't find the right keyword.

Although: Please define "when dealing with images". :P

psi29a commented 8 years ago

I meant, that when we go about transforming our matrix data into image data to be written out to disk we can transpose the numpy matrix as @rbb mentioned above. Or do you consider this unnecessary?

If we can hammer this out, we can get WE into a more stable/common practice state.

This would be something to mention in our documentation as well and in comments for people who come after us. :)

tcld commented 8 years ago

So the only place where image-notation should even be considered is inside the class that handles writing/reading images? I just wanted to be very clear about it. (Transposing right before writing isn't necessary, by the way, since the arrays need to be flattened first anyway. Same goes for reading.)

I could add some comments about this to the image-handler classes, but since so many places are affected by this choice, a tiny section in the documentation is probably important.

psi29a commented 8 years ago

You're correct and right! Comments in code is the way to go. :)

psi29a commented 8 years ago

This needs to have the manifest updated and images to test...

tcld commented 8 years ago

The images have been ready for a while. Updated manifest? Why that?

psi29a commented 8 years ago

Without an updated manifest, we can't make a release. https://travis-ci.org/Mindwerks/worldengine/jobs/90341676

^-- points out which files are missing. If it has nothing to do with python or the library's functionality, I usually just ignore it.

tcld commented 8 years ago

Oh. I mentioned that before: https://github.com/Mindwerks/worldengine/pull/179

I don't know how to properly fix it, and it's not related to this PR.

psi29a commented 8 years ago

Right. I'll take care of it when I get the chance.

toddcarnes commented 8 years ago

OK. I have to ask. What does PR stand for in this context? On Nov 12, 2015 08:06, "tcld" notifications@github.com wrote:

Oh. I mentioned that before:

179 https://github.com/Mindwerks/worldengine/pull/179

I don't know how to properly fix it, and it's not related to this PR.

— Reply to this email directly or view it on GitHub https://github.com/Mindwerks/worldengine/pull/185#issuecomment-156150063 .

tcld commented 8 years ago

Whenever I use it, it stands for Pull Request. When others have been using it, I have always assumed the same and that seemed to work so far.

toddcarnes commented 8 years ago

Thank you On Nov 12, 2015 08:39, "tcld" notifications@github.com wrote:

Whenever I use it, it stands for Pull Request. When others use it, I have always assumed the same and that seemed to work so far.

— Reply to this email directly or view it on GitHub https://github.com/Mindwerks/worldengine/pull/185#issuecomment-156159880 .

psi29a commented 8 years ago

Some people say Pull Request (PR) and others Merge Request (MR). Github has the the workflow that it will ask to merge your pull request if you have the right privileges.

tcld commented 8 years ago

I rebased this and now the changes are kind of trivial. :)

tcld commented 8 years ago

How about this one, @psi29a and/or @ftomassetti ? It's only about minor things, but it has been done for a while. Data: https://github.com/Mindwerks/worldengine-data/pull/12

EDIT: I also rebased it and made the necessary changes.

tcld commented 8 years ago

The changes to lines 400 to 450 in draw.py are actually important; I am not sure why they weren't part of the PR that previously unified the matrix-accessors to [y, x]. Maybe I overlooked it after rebasing onto the satellite-commit.

ftomassetti commented 8 years ago

A few comments added but in general it looks good

tcld commented 8 years ago

I took care of one of the comments but didn't know what to do exactly about describing target.

ftomassetti commented 8 years ago

I am running tests locally together with new data

ftomassetti commented 8 years ago

Tests passed locally. Restarting them on Travis.

Mindwerks / worldengine

New Tests; Several Bug-Fixes #185

179 https://github.com/Mindwerks/worldengine/pull/179