Closed damianooldoni closed 2 years ago
@damianooldoni thanks! Indeed, almost there! Just to be sure, to plot a network in either lines or polygons, the code after mapView()
is used and when using a mix, the leaflet()
chunk should be applied?
I just ran the code on the pbarn_freshwater
and ws_bpns
mix. I now get an error at this code-chunk:
study.area.binary <- shape.to.binarymask(
shape.study.area = study.area,
receivers = projections.locations.receivers,
nrows = nrows,
ncols = ncols
)
Note that this code generated a code anyway (bug in R), but this is a different error and the code stops running after the error:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'rasterize': error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': conversion from feature type sfc_GEOMETRY to sp is not supported
I think your problem mentioned in #72 has been solved by the very last commit? I don't know, I got no errors from shape.to.binarymask()
Strange, I still get the error However, no last commit came through.
Well, in your error I can read this:
[...] function 'rasterize': error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': conversion from feature type sfc_GEOMETRY to sp is not supported
the error seems to happen in receiver_distance_fun.R#L291.
Indeed, sfc_GEOMETRY
is the class of the column geometry
in the study area, which is a sf data.frame. As it cannot be set to sfc_MULTIPOLYGON
/ stc_POLYGON
or sfc_MULTILINE
/sfc_LINE
, R set the combined study area geometry to something generic called sfc_GEOMETRY
.
But it seems strange because in that line I pass as_Spatial(shape.study.area)
to rasterize
so the conversion to a Spatial object happens using the standard as_Spatial()
function from sf
package.
I think you pulled my commit(s) but you forgot to source receiver_distance_fun.R
again before running it. Can you try it again, please? If you still get the error, try this in your console:
test <- as_Spatial(study.area)
and let me know if it works flawlessly as on my laptop.
Strange... I restarted R and sourced the function again. Since that didn't work, I updated R to the latest version (4.1.0) but the error still persists. Could you sync with Github to be sure I am not missing any changes? Also, when running your suggested test, I get the same error:
> test <- as_Spatial(study.area)
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': conversion from feature type sfc_GEOMETRY to sp is not supported
Thanks @PieterjanVerhelst.
I checked my code better, and indeed, I found a bug. Now I get the same error as you. Please pull again.
Indeed, the error is that we cannot make a Spatial Polygon from a simple feature data.frame where the geometry is of type sfc_GEOMETRY
:disappointed:
Hmm, still get the error... So on your pc the code runs perfectly?
No, as I said, I get the same error as you now.
My idea is to proceed by rasterizing the two sf data.frames apart and then merging them in the function, something similar I did in other functions today.
ow wait, something is running!
Aah no, false alarm. It just took a bit more time to return the error π .
@PieterjanVerhelst: it's impossible it works. As I wrote above I understood the error, it's something again related to the fact that merging polygons and lines is something spatial packages in R don't like it very much. I will work on this, I hope to get it solved soon.
Okay, thanks π !
Not so much time today to work further on this. Still, tried merge() but didn't work as this is not what we need! Now I found actually need a mosaic function. I will then follow this tutorial about making a mosaic of rasters.
This has to be implemented in function shape.to.binarymask()
.
Stay tuned. :+1:
I think I solved the problem. In prepare_dataset.R
, when you have to transform the shapefile to a raster via shape.to.binarymask()
, you have to choose:
I also removed two transformations to sp objects via sp_Spatial()
where not needed in control.mask()
and adapt.binarymask()
.
@PieterjanVerhelst: give a try, thanks.
I checked the new code and it seems to provide a realistic output π ! I will test it more thoroughly the coming days on the various datasets/projects to be sure.
Just a brief question, but as you state the value for res
should be interpreted as a square, that means res = 10000
is 100 by 100 metres?
Thanks @PieterjanVerhelst: leave this PR open then until you performed all checks. I tested with a classic uniform study area (e.g. frome
and stour
) and it returned no errors and I think the same distances as before.
About resolution: NO, the resolution of a raster is the side of the cell. So, if you give res = 10
you will have a square which is more or less 10x 10. I say more or less, as it depends on the extent of the raster and rounding effects. So, in case of study.area <- stour
you will get:
> study.area.binary.extended
class : RasterLayer
dimensions : 513, 2333, 1196829 (nrow, ncol, ncell)
resolution : 10.00025, 9.990898 (x, y)
extent : 360408.7, 383739.3, 5754536, 5759662 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=31 +datum=WGS84 +units=m +no_defs
source : memory
names : layer
values : 0, 1 (min, max)
As you can read above, the resolution is very near to 100 x 100.
Moreover, the real resolution is returned on console while running shape.to.binarymask()
. I want always to inform the user about these things :+1:
I hope everything is clear now. In case it is not, just keep posting! :postbox:
Thanks for the clarification! BTW, I tested the code on a mix of lines and polygons (belgian network: freswhater + ws_bpns) and it works. I will continue testing on the other study areas in Europe.
A brief question: is it a problem when polygones are not connected? Or does the resolution of the rasterization process bridge these 'gaps'? Here an example of a reservoir that should be one big polygon, but due to bridges, the reservoir is separated by different polygons:
The rasterization process removes these gaps automatically as the resolution of the raster is not so high to make the bridges appear. But if you increase the resolution really a lot you will get problems I think.
At the moment I don't connect patches automatically because sometimes I had river bodies made of two different entities. So, the number of patches was 2 and after adding receiver points you should expand the raster to connect all receivers and get the number of patches back to 2. That's the idea behind. Now, with the projected receivers this step is way less important as the majority of the projections is working perfectly and so there is no need to "extend" the river body actually.
Okay, that is what I thought, so perfect π .
I found one small bug in calculating the distances for the Belgian networks, more specifically at station s-11
. For some reason, the position of that station is not projected on the river line-shapefile, but on the river bank, causing no calculated distances for that station (value = Inf
). Here is a print screen of the map. The projected position on the bank is made more clear by putting my cursor on it, showing the s-11
field:
UPDATE: the same holds true for station bh-34
in the project 2012_leopoldkanaal
.
Also station stour-19
(from project stour
) is not projected on the river mid-line, but on the bank. However, it does generate a distance output for that station which may not be that far off the actual distance.
When calculating the distances for the project semp
in Lithuania, I get a memory error after running the following code:
cst.dst.frame_corrected <- get.distance.matrix(
binary.mask = study.area.binary.extended,
receivers = projections.locations.receivers
)
Error: cannot allocate vector of size 3.8 Gb
This is rather strange, because the semp
project is much smaller than for instance phd verhelst
.
Note that I restarted R a few times to check if that would clear calculating memory, but it did not solve the problem.
@PieterjanVerhelst: the memory errors pop out because you have still a lot of other water bodies loaded in R, I think. I suggest you to clean your workspace after running the workflow for few river bodies/receivers and restart R (within RStudio) after cleaning. In WIndows Taskmanager (Windows Taskbeher, see screenshot) you will see a sensible drop off of the used memory :+1:
To restart R within RStudio: Session -> Restart R (or just Ctrl+Shift+F10)
Indeed, that is what I tried a few times. Now I also ran the code on the VLIZ RStudio server which has more RAM power, but the error persists. Do you get an output on your pc?
Ok? I will make a fast test for semp
. At which resolution are you working?
okay! I work at res <- 50
Whiel running Γ dapt.binarymask()` step, I read:
Number of patches of binary.mask (river body): 3
This means that the river body is made of three different parts. Is this the case? As I wrote yesterday I do not connect river bodies if disconnected. So, if we start from 3 patches, after adding the receivers to the raster, we will end up to 3 patches.
However, the message printed right after:
The binary.mask (river body) is not connected. Extension needed
is confusing, I have to admit. As this is something I didn't do it in the function. This means that the last function, will never end as it tries to calculate distance along the rivers among receivers which are not connected, so generating memeory size error at a certain point.
So, how to proceed? My idea is the following:
""River body is disconnected. This could lead to crash while calculating the distance among all receiver pairs placed in different patches.""
Do you want connect the river bodies?(Y/n)
What do you think? Connecting the river bodies is not difficult as I do something very similar after adding receivers.
Indeed, three different river bodies are connected. Mostly, I do this in a GIS program to extract a single shapefile with my study area. In case of the semp
project, I tried it via R and did not thought the error was due to that. Your suggestion to proceed is indeed a very good one.
A side note: I have a few more study areas of which the polygons need to be connected. For example, the polygons meuse
and coastal_meuse
in the folder \data\Belgium_Netherlands
need to be merged first, before they are merged to the line shapefile albertkanaal_zeeschelde
. I just let you know this if this would impact your coding for the semp-issue.
Than the last issue I bumped on, is when trying to calculate the distances for the project noordzeekanaal
at res = 50
. When running the code to create the extended binary mask, I get an error that the data was not a vector type, but NULL:
study.area.binary.extended <- adapt.binarymask(binary.mask = study.area.binary,
receivers = projections.locations.receivers)
Error in writeRaster(study.area.binary.extended, "./results/study_area_binary", : object 'study.area.binary.extended' not found
Note that the project noordzeekanaal
takes an incredible time to run (ca. 6 hours). So perhaps better to test with a subset?
BTW, for the other projects, the code runs perfectly π
DAK superpolder
EMMN
2011 warnow
ESGL
2011 loire
2017 fremur
mondego
2004 gudena
2013 stour
2014 frome
2014 nene
And when the issue is solved with the stations s-11
and bh-34
currently not being projected on the river-line, another 3 projects will run perfectly as well.
So almost there π !
Thanks @PieterjanVerhelst for checking all and reporting all the errors.
I will take a look. First I will tackle the error about the disconnected river bodies. About the wrong projection of the receivers s-11 and bh-34, I cannot do a lot. I mean, I can check a little but I have no time now to go within the function dist2Line
of package geoshpere
which does the work.
@damianooldoni thanks π ! Regarding the stations s-11
and bh-34
, I suggest to change the coordinates a bit in the receivernetwork file so the stations are a bit closer to the river-line. Now they are too far off, leading to Inf
values. What do you think on this temporary fix?
Update: I will add another river-line in the shapefiles towards the stations bh-34
and s-11
to try to solve the problem instead of changing coordinates.
Update: issues on stations not being projected on the river midline of the shapefile have been solved by adding river segments towards those stations, so they are projected accordingly.
@damianooldoni I can merge the various rivers in GIS so you don't have to dig into this issue. I actually don't consider this 'merging' step to be part of the distance-calculation function. That leaves 1 rather big issue open: the one on the project noordzeekanaal
for which I can't find a solution.
Review summary: from the 18 projects for which I tried to calculate the distances (both lines, polygons and mix), an error occurred for only 2 π :
noordzeekanaal
project: only polygons in shapefile noordzeekanaal_polygons.shp
. See comment above : https://github.com/inbo/fish-tracking/pull/73#issuecomment-875328816. Note that this study area is rather big with a lot of detection stations ranging from The Netherlands to the Belgian part of the North Sea.
albertkanaal
project: this is a combination of a line shapefile (albertkanaal_zeeschelde.shp
) and a polygon shapefile (meuse_total.shp
). Here I get the error Error: Number of clumps (patches) not equal to initial number of patches
after running the following code:
study.area.binary.extended <- adapt.binarymask(binary.mask = study.area.binary,
receivers = projections.locations.receivers)
Thanks @PieterjanVerhelst for summarizing all bugs in a comment, it will ease my work! I will take a look but it will be next week.
No problem! I am still digging into the issue with the noordzeekanaal
, since an output is generated when I take a higher res. So if I can find a sollution, that issue may drop out by next week ;-).
About noordzeekanaal
I think I found the error. You validate only shapefiles before merging them together, but validation should always occur. For this reason I added a validation of study.area
after defining it (part of commit cf79779), in this way it's validated, no matter which shapefile is used.
I don't know which receivers you are using, but I am quite sure this is the bug.
give a try, please.
The other one, albertkanaal
, another day.... :smile:
I still get an error for the project noordzeekanaal
. Note that I use the receivernetwork_Noordzeekanaal.csv
. When running the code study.area.binary.extended <- adapt.binarymask(binary.mask = study.area.binary, receivers = projections.locations.receivers)
it returns the error Error in matrix(unlist(ini), ncol = 2, byrow = TRUE) : 'data' must be of a vector type, was 'NULL'
I assume a specific column is not filled in or present in this spatial dataset?
@damianooldoni I just got back from holidays so can continue testing again ;-). As mentioned above (comment of 8th July), I only get errors for the projects albertkanaal
and noordzeekanaal
.
BTW, I come and join the (EV)INBO forces from the 1st of September π .
Update: The issue with the project Noordzeekanaal
is solved and probably had to do with a few shapefile-sections that were not connected. However, the issue for the project 2013_albertkanaal
persists. This project usess a combination of a line shapefile (albertkanaal_zeeschelde.shp
) and a polygon shapefile (meuse_total.shp
). Here I get the error Error: Number of clumps (patches) not equal to initial number of patches
after running the following code:
study.area.binary.extended <- adapt.binarymask(binary.mask = study.area.binary,
receivers = projections.locations.receivers)
@damianooldoni this is the last issue, so when this is solved, we can merge the pull request and I will contact Hugo Flavio if he has interest to implement this functionality in his actel
package.
In the shapefiles for the project 2013_albertkanaal
there were also some connectivity issues, which I solved in GIS. So now the function runs for that project as well, so there are no more issues.
Two things to take in mind:
1) Make sure the shapefiles are connected (with waterways it occurs that road bridges lead to intersections)
2) For large shapefiles and networks, a memory issue can pop up. This can be solved by increasing the memory limit via memory.limit(size = )
Many thanks to find the solution for this! π Two things:
Conflicts solved. As expected, not very significant and easy to solve conflicts. @PieterjanVerhelst: please, check once again and approve the PR. Then you can merge and dance πΊ
Just ran a test on the albertkanaal-project and code runs perfectly. So here comes the merge. Next thing to do is arrange a meeting to have a drink π !
This PR solves #72 by using
rbind
instead ofgUnion()
for combining polygons and lines in a study area. I also converted all code using sf package wherever possible. Some raster related functions want SpatialLines or SpatialPolygons as input, so conversion to sp objects is done only where needed and it is internal to some functions, so hidden to most of the users.Combining polygons and lines is still not really recommendable. Even if
rbind
helps a lot, still, the resulting geometry is (sfc_GEOMETRY
) is not supported bymapview
for example.I also generalized the way function
load.shapefile()
select a subset of shapefiles. This function selected the shapfiles based on columnNAME
which was hardcoded in the function. In this PR I made it flexible:I had still no time to run all up to the end of the workflow, but I think it's getting quite far now.