Closed PieterjanVerhelst closed 3 years ago
This error raises when running this commando at line 95:
study.area.binary.extended <- adapt.binarymask(study.area.binary, locations.receivers)
Debugging this, I found that the problem is in the for loop in help function extend_patches()
(receiver_distance_fun.R#L146-L148):
for (i in 1:ncol(crdnts) ) {
inputmat[(crdnts[1, i] - 1):(crdnts[1, i] + 1), (crdnts[2, i] - 1):(crdnts[2, i] + 1)] <- 1
}
Within this function, I find:
Browse[2]> nrow(inputmat)
[1] 2000
Browse[2]> ncol(inputmat)
[1] 4000
Browse[2]> max(crdnts)
[1] 4000
Browse[2]> crdnts[2,7184]
[1] 4000
This means that in the for loop, crdnts[2, i] + 1
becomes 40001 and this is indeed out of bound as number of columns of inputmat
is 4000.
I didn't sovle the bug, but I found the problem, it's 50% of the work! :factory_worker: I will investigate further.
@PieterjanVerhelst : the links are to master, but I am working on branch distance-frome
of course and anyway, the help functions are the same as in master.
Great, thanks!
@damianooldoni I have been trying some stuff out lately, but couldn't make much improvement. However, I noticed that the River Frome shapefiles are SpatialLinesDataFrame
while the files on which the code was developed were SpatialPolygonsDataFrame
. Could that be part of the problem as well, specifically that SpatialLinesDataFrame
standard have a larger resolution? And if so, can the code be made flexible to allow and even combine both dataframe types?
I assume this may take some time, let me know how you want to proceed with this. The calculation of the station distances is the first step in a European-wide meta-analysis I am conducting on eel migration. Hence, as soon as this issue is solved, the sooner I can start digging into the data.
Thanks @PieterjanVerhelst for the hint. I didn't write this code and I probably have to spend some more time to understand better the details of it. Stijn documented the functions but not every single step of course. It can be I contact you by email to get some more background if needed. I hope I can find a easy but complete solutions for this.
I think I solved the bug. The problem was due to the fact that one of the receivers was at one the edges of the raster. Can you check whether the distances between receivers makes sense?
See commit mentioned here above for more details.
I have also cleaned a little bit the code (typos, some variable names, ...) and used sf
objects where possible and converting them on-the-flow to sp only for functions where sf objects are not supported. This actually happens only once in costDistance()
, a function of gdistance
package.
If using sf objects will arise errors in other parts of the workflow, no worries. Just let me know via a separate issue and in any case, we can always convert them to sp
points/polyongs by using the as()
function from sf
package:
sp_obejct <- as(sf_object, "Spatial")
But, first I would like to see this issue closed of course :smile:
Hmm, I still get the error. Did you work in the 'distance-frome' branch? Can you tell which station was on the edge of the raster? I checked in GIS and they seem fairly close to the river or in the river.
And I indeed get an error for other shapefiles that sf
objects are invalid:
> study.area <- gUnion(rivers, nete)
Error in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") :
rgeos_convert_R2geos: invalid R class sf, unable to convert
But indeed, let's first solve the issue with the subscript out of bounds :satisfied:.
Yes, I forgot while cleaning everything to remove the line code with the bug. Pull and try again. Yes, I am working in distance-frome branch.
By the way, I didn't commit the outputs of prepare_datasets.R
, i.e. the tiff file and the csv with the distance among receivers. I leave it to you.
Alright, the code calculates the matrix for the River Frome 👍 ! I will try to run it for networks in other rivers the coming week and if there are no more errors apart from object classes, I will close the issue.
Just one small question: in the output, the distance between stations F2 and F3 is 23 m, while a rough measurement in QGIS gives 15 m (see figure below). Note that this is the distance along the straight line, which is the shapefile. The background itself is an online map.
When I increase the resolution from nrows <- 2000
and ncols <- 4000
to nrows <- 4000
and ncols <- 8000
, the distance even increases to 25 m.
Any idea why there is quite a difference between the QGIS calculation and the R calculation? And why increasing the resolution leads to a larger distance? This difference is not problematic, since the detection error of the receivers is much larger (ca. 300 m). It's just to be sure there is no bug or something in the code/calculation.
I tried to run the code for other receiver networks in other rivers, but encountered a problem. However, this problem may have nothing to do with this issue, so when the problem is more cleared out, we may close this one and open a new issue. I tried 2 networks (code has been added to the prepare_dataset.R
script; I will sent shapefiles via mail because they are too large to push): the network in the River Gudena (Denmark) and River Mondego (Portugal). For the River Gudena the code only calculates distances between some stations, while for the River Mondego I get an error:
> cst.dst.frame <- get.distance.matrix(study.area.binary.extended,
locations.receivers)
Error in costDist[] <- shortestPaths[index, index] :
replacement has length zero
In addition: Warning message:
In .cd2(x, fromCoords) :
Show Traceback
Rerun with Debug
Error in costDist[] <- shortestPaths[index, index] :
replacement has length zero
However, for both rivers, I had to create the line-shapefile myself. For the River Gudena, I added the last part of the river (the original shapefile did not run to the sea) and for the River Mondego I didn't have a proper shapefile, so created it myself. These generated shapefiles only have an 'ID' number. @damianooldoni any thoughts on how to proceed with this? The line-shapefiles from other rivers also need manual additions in GIS.
For better traceability I noticed these two anomalies in two different issues, #64 and #65. Getting back to the isue about distance for From river, indeed, distance should not diverge while increasing the resolution. I have an idea about this bug. Could you please check whether the distance diverge also for receivers of networks you already used in the past? Thanks. Meanwhile I try my solution on river Frome.
@damianooldoni currently I can only calculate distances between stations from the River Frome because of the above mentioned issues in the River Gudena and Mondego. When trying to calculate the distance between stations for the Belgian network, I also get an error, but will put that in another issue.
I have just now watched carefully your screenshot. You measured the distance along the shapefile, BUT as the shapefile is a line, in order to contain the receivers the raster is extended (study.area.binary
-> sutdy.area.binary.extended
) making the river "broader". So, in this case the calculated distance is more similar to the length of the hypotenuse than the length of the line you measured.
See screenshot before extension:
And after extension;
As I said some days ago by phone, I was considering to change strategy when rivers are lines instead of polygons, but I thought it was better as first step to solve this issue which was not depending on the type of spatial objects the river is made of (polygons or lines). Moreover, as we will combine polygons and lines at a certain point, it makes the "new" method prone to bugs. So to play safe, I will focus on existent code although it is less accurate with rivers as lines.
aha, this makes sense! So this issue can be closed, correct?
I would wait still a little. I find still worth a check that the distance grows so much while increasing the resolution. First i would like to close the new one, #66 so that you can check what I asked for here. If should be solved easily. I am working on it just now.
I ran some checks on distance calculations for the Belgian and Frome network. The variations are quite okay, especially for Belgium. Example of distances between stations in Belgian network:
s-2 - s-Wetteren:
nrows <- 2000
and ncols <- 4000
: 6558 kmnrows <- 4000
and ncols <- 8000
: 6546 kms-Vlassenbroek and s-4:
nrows <- 2000
and ncols <- 4000
: 1438 kmnrows <- 4000
and ncols <- 8000
: 1450 km@damianooldoni upon your approval, I will merge the branch and close this issue. Note that I will add documentation about the persisting error that can be ignored (https://stackoverflow.com/questions/61598340/why-does-rastertopoints-generate-an-error-on-first-call-but-not-second ). I will also put the receiver network files in a separate folder since I will start uploading various network files.
Nice! Yes, do it. It seems also that calculated distance converges to GIS manual measurement by increasing the resolution. Yes, makes separate folders, file management is important. And adding documentation about that error is a good reminder for us and anybody else running the code.
When trying to calculate the distances between detection stations in the River Frome (branch
distance-frome
), I get an error at line 107 in the scriptprepare_dataset
:I played around with the number of columns and rows to fit the file in the grid, but it didn't work. Any thoughts on this one?