tum-ens / pyGRETA

python Generator of REnewable Time series and mAps
GNU General Public License v3.0

Error in creating population map #164

Closed simnh closed 4 years ago

simnh commented 4 years ago

In the source code, the population count is used; however, the link in the docs points to the density.

This should be the correct link https://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-rev11/data-download

I think...

simnh commented 4 years ago

Sorry, I realised that the function "generate_population()" has to work with density. I wonder, however, why the file name in config.py is the file name of the count?

simnh commented 4 years ago

Also, I was wondering why line 491 is necessary (see here). It causes a broadcast error. Maybe this is due to the newer version of the dataset (rev11) that I downloaded for the population? But I doubt that the newly revised version should be a problem...

simnh commented 4 years ago

This is the error message I get:

Traceback (most recent call last):
  File "code/runme.py", line 18, in <module>
    generate_maps_for_scope(paths, param)
  File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 25, in generate_maps_for_scope
    generate_population(paths, param)  # Population
  File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 491, in generate_population
    A_POP[600:18000, :] = A_POP_part
ValueError: could not broadcast input array from shape (21600,43200) into shape (16800,43200)

kais-siala commented 4 years ago

> Sorry, I realised that the function "generate_population()" has to work with density. I wonder, however, why the file name in config.py is the file name of the count?

Hi, it is pretty much the same thing. The population count per pixel is equivalent to a density per pixel (not per unit of area; maybe that is the cause of the confusion). I admit that we usually use the term density relative to the area in km². Here I meant density per pixel.

Kais
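To illustrate the distinction between a count per pixel and a density per km², here is a minimal sketch that converts one into the other. It assumes a spherical Earth with a mean radius of 6371.0088 km and a regular latitude/longitude grid; `cell_area_km2` and the example count of 100 people are hypothetical and not part of pyGRETA.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south_deg, cell_size_deg):
    """Area of a square lat/lon cell whose southern edge is at lat_south_deg."""
    lat_s = np.radians(lat_south_deg)
    lat_n = np.radians(lat_south_deg + cell_size_deg)
    dlon = np.radians(cell_size_deg)
    # Exact area of a latitude/longitude cell on a sphere
    return EARTH_RADIUS_KM**2 * dlon * (np.sin(lat_n) - np.sin(lat_s))


cell = 30 / 3600  # 30 arcsec expressed in degrees
area_equator = cell_area_km2(0.0, cell)
area_60n = cell_area_km2(60.0, cell)

count = 100.0  # hypothetical population count in one pixel
density_equator = count / area_equator  # people per km² at the equator
density_60n = count / area_60n          # same count, smaller pixel, higher density
```

The same count per pixel therefore corresponds to different densities per km² depending on latitude, which is why the two quantities should not be mixed.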

kais-siala commented 4 years ago

> Also, I was wondering why line 491 is necessary (see here). It causes a broadcast error. Maybe this is due to the newer version of the dataset (rev11) that I downloaded for the population? But I doubt that the newly revised version should be a problem...

This could be. I remember that the raster I have has no values beyond a certain latitude. Maybe it is different in v11. I will download it and check the dimensions, then get back to you. Or maybe you can tell me the size of the raster you downloaded?

simnh commented 4 years ago

This is the shape of the read population file v11:

(Pdb) A_POP_part.shape
(21600, 43200)

So it seems that it is not just a part of the grid but actually has the same shape as the A_POP array, right?

kais-siala commented 4 years ago

You are right, so you should remove that line!
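If both the older, partial raster and a full-size raster such as rev11 need to be supported, a shape check could replace the hard-coded slice. The following is only a sketch under assumptions, not the pyGRETA code: `fill_population` and `row_offset` are hypothetical names, and the arrays are scaled-down stand-ins for the (21600, 43200) grid.

```python
import numpy as np


def fill_population(a_pop_part, full_shape, row_offset):
    """Return a full-size array that holds a_pop_part.

    A raster that already covers the whole grid is used as is; a smaller
    one is pasted into a zero array starting at row_offset (hypothetical).
    """
    if a_pop_part.shape == full_shape:
        return a_pop_part.copy()
    rows, cols = a_pop_part.shape
    if cols != full_shape[1] or row_offset + rows > full_shape[0]:
        raise ValueError(
            f"raster of shape {a_pop_part.shape} does not fit into {full_shape}"
        )
    a_pop = np.zeros(full_shape, dtype=a_pop_part.dtype)
    a_pop[row_offset:row_offset + rows, :] = a_pop_part
    return a_pop


# Scaled-down stand-ins (factor 100) for the real grid
full_shape = (216, 432)
full_raster = np.ones(full_shape)         # behaves like the rev11 file
partial_raster = np.ones((174, 432))      # behaves like a cropped raster

same = fill_population(full_raster, full_shape, row_offset=6)
padded = fill_population(partial_raster, full_shape, row_offset=6)
```

With such a check, the assignment no longer depends on which revision of the dataset was downloaded.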

simnh commented 4 years ago

Well, this seems to work; however, I now get the following problem (which I also get for some other functions):

Traceback (most recent call last):
  File "code/runme.py", line 18, in <module>
    generate_maps_for_scope(paths, param)
  File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 25, in generate_maps_for_scope
    generate_population(paths, param)  # Population
  File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 492, in generate_population
    A_POP = resizem(A_POP, 180 * 240, 360 * 240) / 4  # density is divided by 4
  File "/home/admin/projects/pyGRETA/code/lib/util.py", line 157, in resizem
    reshape(reshape(repmat((A_in.flatten(order="F")[np.newaxis]), row_rep, 1), (row_new, -1), order="F").T, (-1, 1), order="F"), 1, col_rep
MemoryError: Unable to allocate array with shape (933120000,) and data type float64

I guess this is just a resource problem on my local machine, so I will move this to a machine with more resources.

On what machine (memory etc.) are you usually running the code?

simnh commented 4 years ago

What is happening in this line (resize, and divide by 4):

A_POP = resizem(A_POP, 180 * 240, 360 * 240) / 4  # density is divided by 4

kais-siala commented 4 years ago

> Well, this seems to work; however, I now get the following problem (which I also get for some other functions):
>
> Traceback (most recent call last):
>   File "code/runme.py", line 18, in <module>
>     generate_maps_for_scope(paths, param)
>   File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 25, in generate_maps_for_scope
>     generate_population(paths, param)  # Population
>   File "/home/admin/projects/pyGRETA/code/lib/input_maps.py", line 492, in generate_population
>     A_POP = resizem(A_POP, 180 * 240, 360 * 240) / 4  # density is divided by 4
>   File "/home/admin/projects/pyGRETA/code/lib/util.py", line 157, in resizem
>     reshape(reshape(repmat((A_in.flatten(order="F")[np.newaxis]), row_rep, 1), (row_new, -1), order="F").T, (-1, 1), order="F"), 1, col_rep
> MemoryError: Unable to allocate array with shape (933120000,) and data type float64
>
> I guess this is just a resource problem on my local machine, so I will move this to a machine with more resources.
>
> On what machine (memory etc.) are you usually running the code?

I usually run it on a server with lots of memory (512 GB) and CPU power, to get fast results. You can run the code on a normal computer, but not for a large geographic scope (continent), as you have noticed in your case.
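As a rough back-of-the-envelope check of the memory involved, the sizes can be computed from the shapes that appear in the traceback. This is only a sketch: `array_gb` is a hypothetical helper, and it counts a single float64 copy of each array, whereas the intermediate arrays created during resizing add to the total.

```python
import numpy as np

BYTES_PER_VALUE = np.dtype(np.float64).itemsize  # 8 bytes per float64


def array_gb(rows, cols, itemsize=BYTES_PER_VALUE):
    """Memory of one dense rows x cols array in gigabytes (1 GB = 1e9 bytes)."""
    return rows * cols * itemsize / 1e9


# 21600 * 43200 = 933120000 elements, the shape named in the MemoryError
input_gb = array_gb(21600, 43200)

# Target shape passed to resizem: (180 * 240, 360 * 240) = (43200, 86400)
output_gb = array_gb(180 * 240, 360 * 240)

print(f"input raster:  {input_gb:.2f} GB")
print(f"output raster: {output_gb:.2f} GB")
```

Under these assumptions, one copy of the input raster needs about 7.5 GB and one copy of the resized raster about 30 GB, which is consistent with the allocation failing on a typical desktop machine.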

kais-siala commented 4 years ago

> What is happening in this line (resize, and divide by 4):
>
> A_POP = resizem(A_POP, 180 * 240, 360 * 240) / 4  # density is divided by 4

The original dataset I used has a resolution of 30 arcsec, and I was bringing every raster to 15 arcsec. So one pixel in the original dataset covers four pixels in the new dataset with the desired resolution. Hence, each of those four pixels gets one quarter of the original pixel's population.
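The effect of resizing and dividing by 4 can be checked on a tiny array. This sketch uses `np.repeat` as a stand-in for `resizem` (it is not the pyGRETA implementation), and `upsample_counts` is a hypothetical name; the point is only that the total population is conserved.

```python
import numpy as np


def upsample_counts(counts, factor=2):
    """Split each pixel into factor x factor pixels and share its count equally."""
    fine = np.repeat(np.repeat(counts, factor, axis=0), factor, axis=1)
    return fine / factor**2


# 2 x 2 raster of population counts at the coarse (30 arcsec) resolution
coarse = np.array([[4.0, 8.0],
                   [0.0, 12.0]])

# 4 x 4 raster at the fine (15 arcsec) resolution
fine = upsample_counts(coarse)

# The total population is the same before and after resizing
assert np.isclose(fine.sum(), coarse.sum())
```

Without the division, the resized raster would contain four times the original population.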

simnh commented 4 years ago

Works now!