Closed p-a-s-c-a-l closed 5 years ago
I would like to take care of the grid with land usage percentages generation.
My only doubt is which prioritize, input layers script correction to have well computed differences between layers since right now it is not correct (as @helllth pointed in https://github.com/orgs/clarity-h2020/teams/data-processing-team/discussions/1) Or automatize the generation of this grid regardless of having or not correct data right now.
Hello to everyone,
As reported in the excel file, in which is described the model: • “Direct shortwave radiation = 90% di Global shortwave radiation”; • “Diffuse shortwave radiation = 10% di Global shortwave radiation”; • Global shortwave Radiation should be taken from the site https://meteoexploration.com/products/SolarCalculator.html, considering the 21th of June at 12:00 - • sun's altitude angle above the horizon: the formula is included in the sheet of the excel file called “η” • sun's azimunth angle: the values are reported in the excel sheet tagged “θ” Regarding the surface temperature, Luis defined a relation connecting land use, FUA and surface temperature. Moreover, Luis implemented the HW local effect related to Naples.
After our todays developers telco, I have a question regarding land use grid generation. Using europen 500m grid I will get the corresponding percentages of land using in each cell by using input layers (water,roads,etc...) but this is ok for heat wave hazard but I guess I would have to do something similar for pluvial floods, so I have to create a different grid with... which input layers (streams, basins and other land use layers? which ones? Maybe I can use the same grid for both hazzards?
I need a clarification on how to handle this from PLINIVS, maybe @alecapolupo can help me.
I agreed with Miguel Angel to create a table referencing european laea etrs 500m cells which adds percentages for each land use including streams and basins. Percentage values will be Postgresql decimal(3,2). ( 0.31 for example), if more precision is needed, please let me know.
Hi Mario, surely you should apply the same grid for both hazards (HW and FL) since land use is the most important parameter in both cases. Actually streams and basins will not be affected by adaptation options implementation. Nevertheless, I'm agreed with the solution suggested by Miguel.
Nice, I will do that. I have another issue with this grid generation because I got overlapping areas of 0,001 or even smaller than that. In the end such intersections are translated as percentages of 0,00 (so if I limit percentages to 2 decimal values or if I do not force a minimum overlapping area, then I get such 0,00 percentages that are meaningless) Any idea on how to handle this? I can force a minimum percentage value, what do you think?
For instance in Napoli and water bodies I have: Intersections of water <0,01%of a cell = 129 times Intersections of water >0,01% a cell = 389 times
I mean ... maybe a lot of meaningless results?
you could limit the percentage to 2 decimal values just for visualization purpose but you should store all the information. For instance you should store 0,001 and not 0,00.
then I will consider every result meaningful. thank you.
you are welcome!
After some testing on percentages calculation of a specific land use in a specific cell I noticed that sometimes I have more than 100% of a cell ocuppied by polygons of the same land use (vegetation for instance) and this make no sense. Thinking about it the only reasonable explanation I found for such a thing is vegetation have several polygons in that cell but at the same time those polygons are overlapping, so I am adding the same area more than once. So, there are two possible solutions: -try to remove overlapping areas when I found more than one polygon of vegetation in the same cell -try to cleanup the whole layer by merging overlapping areas of vegetation
For me it is easier to do this once when I insert vegetation data into database and forget about this problem in any further calculation but I do not know which criteria will follo a command doing the merge, I mean, two polygons sharing area will become one single polygon? or which of the two polygons will delete the overlapping area and which do not?
Any advice regarding this will be very helpful. Thank you in advance.
Sorry Mario but I have a doubt about this sentence "After some testing on percentages calculation of a specific land use in a specific cell I noticed that sometimes I have more than 100% of a cell ocuppied by polygons of the same land use (vegetation for instance) and this make no sense." Could you explian me please? What do you mean with "specific land use"? Surely, your final outcome is affected by the overlapping areas. Therefore, the result will change once you will solve all the problems.
I mean i have several vegetation polygons in the same grid cell, so if I add all areas of those polygons they should give me a percentage of area of the total cell. The problem is, is some vegetation polygons are overlapping then I am adding several times the same area, this way I can have a percentage of vegetation in the cell higher than 100%
I am just guessing, I suspect that is the problem, but I am still doing some checkings, my questions are: 1) is that possible? I mean it could happen as Urban atlas data is? 2) how to solve it?
I found the next circunstance:
As you may see the same cell numbered as 10272104 intersect with 6 different vegetation polygons with identifiers 28355,1447,27485,28000,28390 and 29002.
Each polygon has an area of intersection (except one, that is 0, I guess this one only touches the cell boundaries) and the total area of those polygons from vegetation is 281.420 which is greater then the cell itself (250.000) so percentage is above 100% (around 112%)
So this has to be solved somehow and I have two ideas: 1- clean up vegetation layer frmo the very beginning to avoid this from happening, but I do not know which criteria to follow when several polygon overlaps, what should I do merge all of the into the same one? --> here is my question, how can i do this? it would be fine to do such a thing? 2- try to get a more complex query taking more time to substract each possible intersection between all those polygons in the same cell. --> a harder way to fix this, indeed we can find a similar problem in other calculations implying equally harder workarounds, so for me is no the best way
You can "dissolve" the overlapping geometries of the same categories.
what do you mean with 'dissolve' and how can i do that?
Now I am trying a more complex query, trying to get the percentage of vegetation in the cell by doing the union of the different vegetation geometries in a cell. But this is taking a lot of computation effort.
And I am afraid I am getting weird errors: "ERROR: lwgeom_intersection: GEOS Error: TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 4650860 1965060 at 4650860 1965060 SQL state: XX000" that I do not know how to handle...
This is the tool I mentioned before:
"ERROR: lwgeom_intersection: GEOS Error: TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 4650860 1965060 at 4650860 1965060 SQL state: XX000" that I do not know how to handle...
Perhaps the function ST_MakeValid(Geometry) can help you, if you use this function on your input geometries.
This is the tool I mentioned before:
ST_MakeValid(
Thank you, I will try but there is some nested functions with geometries as inputs and outputs. Lets see.
This is the cell 10272104 in a map and as you may see there are several geometries overlapping. when I try to do the dissolve I get:
Algorithm 'Dissolve' beginning ... Input parameters: {'FIELD': [], 'INPUT': 'pagingEnabled = \' true \ 'restrictToRequestBBOX = \' 1 \ 'srsname = \' EPSG: 3035 \ 'typename = \' clarity: vegetation \ 'url = \' http : //services.clarity-h2020.eu: 8080 / geoserver / clarity / ows? \ 'version = \' auto \ 'table = \ "\" sql =', 'OUTPUT': 'TEMPORARY_OUTPUT'}
The object (25355) has an invalid geometry. Please correct the geometry or change the Process configuration to the option "Ignore invalid input objects".
I am testing a workaround on the calculation by getting the union of the vegetation polygons intersections with the cell. But sadly I am getting errors: "ERROR: GEOSUnaryUnion: TopologyException: Input geom 1 is invalid: Ring Self-intersection at or near point 4687350 1980280 at 4687350 1980280 SQL state: XX000"
Anyway I think it would be better to achieve doing that dissolve when extracting data from Urban Atlas. As long as I understood that can be done with Postgis but I still do not know how.
You cannot do any operation on invalid geomatries. Therefore, you should check the geometries validity, identify which ones are invalid and fix them.
That sounds fine but how? I mean you are right but what means exactly... "invalid"?
Finally I made my query work, but it takes long to process, for vegetation it took 50min. I am trying to generate the whole land use grid with this query. Lets see.
Anyway I think a cleaning or simplifying of the data is needed after data is extracted and combined from the different data sources (UA/STL/ESM) and before doing any further operation with it.
As you may see in the next picture, there is A LOT of overlapping for the vegetation layer and this could happen in many other layers.
So I am not sure how to achieve doing that but I think it should be done with something like:
SELECT (ST_Dump(geom)).geom FROM (SELECT ST_Union(geom) AS geom FROM tmpTable) sq;
I am not sure how to ... this make sense to you?
I managed to load layers 1-8,11 in the grid and percentages look correct but layers 9,10,12 gave an error:
ERROR: lwgeom_intersection: GEOS Error: TopologyException: found non-noded intersection between LINESTRING (4.652e+06 2.00256e+06, 4.65179e+06 2.0025e+06) and LINESTRING (4.65195e+06 2.00254e+06, 4.65196e+06 2.00255e+06) at 4651950.8172455253 2002544.7905081783
SQL state: XX000
I have to check what is the problem... Layer of the grid is alrady published on Geoserver: http://services.clarity-h2020.eu:8080/geoserver/clarity/wms?service=WMS&version=1.1.0&request=GetMap&layers=clarity%3Aland_use_grid&bbox=4650000.0%2C1955500.0%2C4721000.0%2C2008000.0&width=768&height=567&srs=EPSG%3A3035&format=image%2Fgif
Hi Mario, some comments from my side:
what do you mean with 'dissolve' and how can i do that?
I think you've already found, it maps to ST_UNION in POSTGIS: but when applied on all the layer features geometries returns one probably huge geometry (multi). For getting rid of overlapping parts, I'd try to make an ST_UNION of ST_OVERLAPS geometries for those features with different features id (I'm sorry but I'm not in the condition to produce a more detailed example just now, but I hope you can get the concept).
Ring Self-intersection
When I get errors like this I usually find applying a zero distance (or a small distance one) buffer can fix the geometries, or at least help and reduce the number of problems a lot.
Hi Stefano, I think I managed to use ST_UNION as you pointed, but regarding the "RING SELF-INTERSECTION" I am not sure what you mean, which operation with a small buffer should be applied? I found this but I am not sure to understand it:
This will give you one huge MultiPolygon, which probably isn't what you want. You can get the individual dissolved components with ST_Dump. So:
SELECT (ST_Dump(geom)).geom FROM (SELECT ST_Union(geom) AS geom FROM tmpTable) sq;
This gives you a separate polygon for each set of touching inputs, but groups of inputs that were separated by a short distance will remain as separate geometries. If you have access to PostGIS 2.2.0rc1, you can merge geometries that are close together into a single GeometryCollection using ST_ClusterWithin:
SELECT unnest(ST_ClusterWithin(geom, 0.0001)) AS grp FROM tmpTable;
thank you!
Hi Mario
regarding the "RING SELF-INTERSECTION" I am not sure what you mean, which operation with a small buffer should be applied?
I mean that when you have those kind of errors on a layer, if you update all the features geometries with an st_buffer(geom,0) or a small distance buffered version, most if not all errors should be cleaned.
Nice, thank you. I will give it a try. cheers.
I have been testing during the weekend:
GEOSUnaryUnion: TopologyException: Input geom 1 is invalid: Self-intersection at or n
ear point 4666041.8885864411 1987698.4477502776 at 4666041.8885864411 1987698.4477502776
This is where I ended. My last try was running this for about 20 hours to merge intersections in the table/layer:
nohup psql -U "postgres" -d "clarity" -c "insert into vegetationb (geom)
SELECT ST_Union(ST_ClusterIntersecting(geom)) FROM vegetation2;"
&>./resultado &
But before running it I did some "cleaning":
update vegetationb l2
set geom=St_MakeValid(St_Multi(St_Buffer(l.geom,0.0001)))
from vegetation2 l
where l.gid=l2.gid;
To try to stabilize the geometries in the table I was going to use.
Any help or advice will be appreciated. I already asked Meteogrid for some help.
SELECT ST_Multi(ST_Union(f.geom)) as singlegeom FROM vegetation as v, class40_napoli as f where ST_intersects(f.geom,v.geom);
...
ERROR: array size exceeds the maximum allowed (1073741823) SQL state: 54000
Everything I try gives a weird error...
PostgreSQL has a single field size limit of 1GB ...
I think that was wrong, probably I was generating one geometry with all. Now I am trying:
insert into vegetationb (geom) (
SELECT ST_Multi( (ST_Dump(ST_Union(a.geom,b.geom))).geom ) as geom
FROM vegetation a ,class40_napoli b
WHERE ST_Intersects(a.geom, b.geom) AND (a.gid < b.gid)
);
This should generate geometries which intersect others, so adding this to those not intersecting will make the whole vegetation. Anyway I am not sure it worked correctly... but it is close to what we need.
It looks like layer generated by this las query has overlappings.... so I changed it to:
insert into vegetationb (geom) (
SELECT ST_Multi( (ST_Dump( ST_Union(ST_Intersection(a.geom,b.geom),ST_SymDifference(a.geom,b.geom)) )).geom ) as geom
FROM vegetation a ,class40_napoli b
WHERE ST_Intersects(a.geom, b.geom) AND (a.gid < b.gid)
);
But still looking like having overlapping areas... and each time I click on a polygon checking geometry seems to be several overlapping....
This new layer is only getting intersections but trying to obtain single geometry iinstead of having several overlapping. This means having polygons from UA+ESM but avoiding repeating overlapping areas by havin one single polygon as the addition of both ESM+UA. So this layer is still missing those polygons from UA and ESM that not overlapp each other, but it is a test to know if overlappings are being properly generated, that is the point and that is what I would like to know, cna PLINIVS help here please?
In Atos Geoservers: http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?
-vegetation (contains only data from UA): -class40_napoli (contains only data from ESM): -vegetation2 (old one, incorrect, has overlappings... with all from UA+ESM) -vegetationb (new one, probably incorrect.... generated as result of applying query above), as oyu can see in the previous post image, the bottom right window shows what is selected and highlighted in orenge (in the map) at depicts 6 geometries, so it looks like overlappings where not merged, am i right?
Today I am working with Alberto from Meteogrid and we are making some small progresses. FInd out that some spatial indexes were missing and checking why we are getting those weird errors.
Nice help from Meteogrid side :D
FInally I can say, problem is solved. After a long discussion and testing with my colleague Daniel we suceed on generating land use per cell as a single polygon, this approach avoids lot of problems and I hope still fits project needs.
This means I can generate not self-overlaping layers, so I am able now to include in all my scripts for input layer generations a new query cleaning any issue between geometries of each single layer being generated. This would avoid a lot of problems in layers like vegetation with data form different data sources (UA,ESM,STL).
At the same time this will enhance land use grid calculation which is already solved but was not generating correct results because of this issue (self or not self ovelapping).
Now the only thing pending is to resolve the not self overlapping, but I guess it will be easier with this approach. This is computing differences between input layers on its generation to ensure there are not overlappings between them.
This approach means now input layers would not come as original geometries but aggregated geometries by european grid cell like the image, this ensure any layer will come to a number of geometries limited to the number of european grid cells for specific city, in case of Naples I have 10.000 cells while layers up to 283.000 so now it is reduced to up to 10.000 geometries per layer, which makes any further computation much more efficient since I could rely on usual JOINS instead of spatial operations.
I hope this wont be a problem, can PLINIVS confirm this is ok to proceed with this strategy?
Alberto from Meteogrid detected that UA2006 brings incorrect geometry data in some cases so I am evaluating if try to do some Postgis operations to try to fix it or just delete a geometry which is not correct.
So first of all I will try to add such checkings to my script, but IMHO trying to correct could solve or not the issue, but if it doesnt it will ruin any kind of afterwards calcualtion so deletion sounds better, but that is something that I would like PLINIVS to agree before doing it.
After some really qiuck test for 1 single cell and vegetation, I got dissappointed when I ran it to whole Napoli area and took for vegetation (99627 polygons) around 9-14 hours to complete the query
Regarding computing differences between input layers, I finally came to a solution, but each layer is a bit different because of different data sources and some problems are rising but things are getting better finally. For now I did 4 out of 12 and trying to fix problems on layer 5. But I only expect further problems in layers 7 and 8. So I hope to come back to you soon with a "all is finally done".
Finally I got rid of the annoying geometry errors and seems like I generated a correct dataset but I need someone from PLINIVS to confirm it. It willl be avaiable during tomorrow in ATOS Geoserver since I am currently generating some of the layers.
Good! When it is ready, let us know and we will check them.
... this is crazy, yesterday it worked, now it is not... I am still getting: ERROR: lwgeom_difference: GEOS Error: TopologyException: found non-noded intersection between LINESTRING (4.69213e+06 1.97617e+06, 4.69213e+06 1.97619e+06) and LINESTRING (4.69213e+06 1.97617e+06, 4.69213e+06 1.97618e+06) at 4692129.9998000003 1976179.5260808717
Looks like geometry issues have been finally solved, I am still testing but it is working for Naples and I would like to load Linz and Stockholm.
Ok Mario, let me know when I can check them.
Please Alessandra take a look on it to know if everything is fine.
DC 1,2,3 cities are loaded (Napoli, Linz and Stockholm), this means 12 heat wave layers and 2 pluvial flood layers. Also land use grid percentages have been generated for all three.
STOCKHOLM
LINZ
NAPOLI
As a curiosity, Napoli vegetation is not the one with the highest quantity of polygons... but for some reason it is the heaviest layer to process, it takes around 7-8 hours.
I also loaded Stoke on trent (UK) for being a small city and only just for test purposes. I will delete it once PLINIVS confirm all is correct.
Some statistics of generated data:
City ID value (next to city name), for internal use only. Also layer legend (colors used in Atos Geoserver for each corresponding layer) It is curious that each city is in extension more or less double size compared with the previous one.
We are still missing some layers for being able to calculate the Local Effects:
We need a resource (in the form of gridded data) that indicates the percentages (in each cell) of the emissivity, etc. --> action(METEOGRID+ATOS)