rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0
534 stars 88 forks source link

I get two completely different raster values on two platforms #1439

Closed priyapatel1 closed 6 months ago

priyapatel1 commented 6 months ago

I'm using the exact same R script on my MacBook and a cloud computer on Digital Ocean. I use the exact same raster and the exact same script and get two different set of values. I'm truly confused about what's going on here and I would really appreciate some help. For both platforms I'm using terra_1.7-71

You can download the raster yourself here: https://dd.weather.gc.ca/model_raqdps-fw/10km/grib2/00/000/

Here's the script I use:

library(terra)

poll.name = "20240227T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2"

temp.ras = terra::rast(poll.name)

head(values(temp.ras))

On my computer I get the following:

> head(values(temp.ras))
     2[m] HTGL=Specified height level above ground; Mass Density (Concentration) [kg/(m^3)]
[1,]                                                                                6.5e-10
[2,]                                                                                6.5e-10
[3,]                                                                                6.5e-10
[4,]                                                                                6.5e-10
[5,]                                                                                6.5e-10
[6,]                                                                                6.5e-10

On Digital Ocean I get the following:

> head(values(temp.ras))
     2[m] HTGL=Specified height level above ground; Mass Density (Concentration) [kg/(m^3)]
[1,]                                                                              1.207e-09
[2,]                                                                              1.207e-09
[3,]                                                                              1.207e-09
[4,]                                                                              1.207e-09
[5,]                                                                              1.207e-09
[6,]                                                                              1.207e-09
> 

This is the Digital Ocean R version _
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 4
minor 3.2
year 2023
month 10
day 31
svn rev 85441
language R
version.string R version 4.3.2 (2023-10-31) nickname Eye Holes

This is my R Version:

version _
platform aarch64-apple-darwin20
arch aarch64
os darwin20
system aarch64, darwin20
status
major 4
minor 3.1
year 2023
month 06
day 16
svn rev 84548
language R
version.string R version 4.3.1 (2023-06-16) nickname Beagle Scouts

kadyb commented 6 months ago

Maybe you have different versions of GDAL on these systems and that is why there is the difference?

brownag commented 6 months ago

I think the difference has to do with floating point calculations on aarch64 (ARM/M series Mac) vs. x86_64 (Digital Ocean) . It is known that these types differences will occur on these different architectures; for example: https://stackoverflow.com/questions/64036879/differing-floating-point-calculation-results-between-x86-64-and-armv8-2-a

In general these are very small numbers. From a practical stand point I think they are probably functionally equivalent (i.e. very near zero) for most or all purposes (but perhaps not for your atmospheric case!). If you need precision in your floating point calculations out to the 9th/10th decimal place you may need to use a different approach, 64-bit "double precision" floating points, or something like that.

EDIT: I do not know enough about the internals offhand to saywhether there is an issue in GDAL or terra, but I think with these small amounts of mass you are at or near the limits of the precision that can be represented. Possibly working with a different unit (not kg/m^3) would get you more consistent results

priyapatel1 commented 6 months ago

This definitely isn't the issue. The mean, max, and min of the data are all identical. And although they are small, they are pre-baked, modelled numbers that require precision for my work.

On Wed, Feb 28, 2024 at 12:20 PM Andrew Gene Brown @.***> wrote:

I think the difference has to do with floating point calculations on aarch64 (ARM/M series Mac) vs. x86_64 (Digital Ocean) . It is known that these types differences will occur on these different architectures; for example: https://stackoverflow.com/questions/64036879/differing-floating-point-calculation-results-between-x86-64-and-armv8-2-a

In general these are very small numbers. From a practical stand point I think they are probably functionally equivalent (i.e. very near zero) for most or all purposes. If you need precision in your floating point calculations out to the 9th/10th decimal place you may need to use a different approach, 64-bit floating points, or something like that.

— Reply to this email directly, view it on GitHub https://github.com/rspatial/terra/issues/1439#issuecomment-1969476688, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMB6VMJIUZEBCMYONF76NVDYV5RPJAVCNFSM6AAAAABD4EO5UWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRZGQ3TMNRYHA . You are receiving this because you authored the thread.Message ID: @.***>

sigmafelix commented 6 months ago

I tried reading the GRIB2 file in two environments:

  1. Ubuntu 22.04.3 (x86_64), terra 1.7-71 (GDAL 3.6.4)
  2. Ubuntu 22.04.4 (x86_64), terra 1.7-71 (GDAL 3.8.3) Both got the similar results with what you've found in your laptop (aarch64, terra 1.7-71):
> head(grib1)
  2[-] RESERVED=Reserved; (prodType 0, cat 20, subcat 0) [-]
1                                                 6.5225e-10
2                                                 6.5225e-10
3                                                 6.5225e-10
4                                                 6.5225e-10
5                                                 6.4850e-10
6                                                 6.4850e-10

Related to #1427, I suggest you check md5sum of the file on both systems using --

# move to the file directory
cd yourdir_with_grib2
md5sum 20240227T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2

I could find md5sum values from both systems

81e48fd106c59f2966f08e8c295e932c 20240227T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2
brownag commented 6 months ago

This definitely isn't the issue.

I can confirm that I get the exact same results as @sigmafelix (and approximately what you seem to get locally) with x86_64 GDAL 3.8.3 (Windows and Manjaro Linux) and Ubuntu 22.04.3 aarch64 (raspi) GDAL 3.4.1. So not an architecture issue specifically, and also probably (?) not a GDAL version issue. Sorry for the goose chase there on my part.

I also wondered (as @sigmafelix points out) if there was some sort of issue with the file. How are you downloading the file on DigitalOcean?

You could directly access the source with GDAL using /vsicurl/ and see if you get the same results:

x <- rast("/vsicurl/https://dd.weather.gc.ca/model_raqdps-fw/10km/grib2/00/000/20240227T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2")
kadyb commented 6 months ago

I think @priyapatel1 should report (terra::gdal()) what version of GDAL is using to rule out the use of some old version like GDAL 2. However, Robert also suggested that the file may be corrupted.

priyapatel1 commented 6 months ago

This definitely isn't the issue.

I can confirm that I get the exact same results as @sigmafelix (and approximately what you seem to get locally) with x86_64 GDAL 3.8.3 (Windows and Manjaro Linux) and Ubuntu 22.04.3 aarch64 (raspi) GDAL 3.4.1. So not an architecture issue specifically, and also probably (?) not a GDAL version issue. Sorry for the goose chase there on my part.

I also wondered (as @sigmafelix points out) if there was some sort of issue with the file. How are you downloading the file on DigitalOcean?

You could directly access the source with GDAL using /vsicurl/ and see if you get the same results:

x <- rast("/vsicurl/https://dd.weather.gc.ca/model_raqdps-fw/10km/grib2/00/000/20240227T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2")

This is the method I'm currently using. Doesn't change the results.

priyapatel1 commented 6 months ago

I think @priyapatel1 should report (terra::gdal()) what version of GDAL is using to rule out the use of some old version like GDAL 2. However, Robert also suggested that the file may be corrupted.

On my computer (the one that reports correct results, the terra version is "3.5.3", on the Ubuntu terminal, it's "3.0.4".

sigmafelix commented 6 months ago

How about md5sum values @priyapatel1 ?

priyapatel1 commented 6 months ago

How about md5sum values @priyapatel1 ?

@sigmafelix how do I check this if I'm importing the file without a local download? I'm currently using this method:

x <- rast("/vsicurl/https://dd.weather.gc.ca/model_raqdps-fw/10km/grib2/00/000/20240229T00Z_MSC_RAQDPS-FW_PM2.5_Sfc_RLatLon0.09_PT000H.grib2")

NOTE: The files in this repository change daily (it's weather data) so the URL may need to be updated if you find that it's not working for you.

priyapatel1 commented 6 months ago

I tried updating the gdal to match what I have on my MacBook, and I got this warning (and the update didn't work):

install.packages("gdal")

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
Warning message:
package ‘gdal’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages 
brownag commented 6 months ago

I tried updating the gdal to match what I have on my MacBook, and I got this warning (and the update didn't work):

install.packages("gdal")
...

"gdal" is not an R package on CRAN. You can install from regular command line. Add the ubuntugis PPA, update apt repositories, then install gdal-bin and libgdal-dev.

Depending on the version of Ubuntu you are running the PPA will give you a different version of GDAL. I am guessing you are on Ubuntu 20.04 (focal) or older. Not sure if DigitalOcean still has 18.04 (bionic) droplets? I think you probably can't create a droplet with 18.04 at least because it is past end of life. It might be good if you could confirm Ubuntu version too.

GDAL 3.0.4 is the latest in ubuntugis-unstable for 18.04. The latest version you can get on a 20.04 with ubuntugis is 3.3.2, ubuntugis-unstable is 3.4.3. If you upgrade to 22.04 (probably the best idea) then you can get 3.6.4 (stable) or 3.8.3 (unstable)

PPA lists https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable

Something like:

sudo add-apt-repository ppa:ubuntugis/ppa
sudo apt update
sudo apt install gdal-bin libgdal-dev

Hopefully you do not have any conflicts; note that it is possible that there are some other libraries that are linked to your current GDAL version. After updating GDAL you need to reinstall/recompile terra and any other GDAL-dependent libraries.

priyapatel1 commented 6 months ago

Okay, I reinstalled GDAL without any issue. I tried to reinstall terra. It didn't work because of PROJ (I think). I think this may be the issue.

This is where the terra install failed:

checking GDAL: /usr/share/gdal/pcs.csv readable... no
checking GDAL: checking whether PROJ is available for linking:... yes
checking GDAL: checking whether PROJ is available for running:... free(): invalid pointer
./configure: line 3651: 529395 Aborted                 ./gdal_proj
no
configure: error: OGRCoordinateTransformation() does not return a coord.trans: PROJ not available?
ERROR: configuration failed for package ‘terra’
* removing ‘/usr/local/lib/R/site-library/terra’
* restoring previous ‘/usr/local/lib/R/site-library/terra’
* installing *source* package ‘viridis’ ...
** package ‘viridis’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (viridis)
* installing *source* package ‘leaflet’ ...
** package ‘leaflet’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (leaflet)

The downloaded source packages are in
    ‘/tmp/RtmpVhuVtC/downloaded_packages’
Warning message:
In install.packages("terra", dependencies = TRUE) :
  installation of package ‘terra’ had non-zero exit status
priyapatel1 commented 6 months ago

Current GDAL version.

GDAL 3.3.2, released 2021/09/01
free(): invalid pointer
Aborted
brownag commented 6 months ago

You probably will also need to update PROJ, also available for upgrade after adding the ubuntugis PPA:

sudo apt install proj-bin libproj-dev
priyapatel1 commented 6 months ago

You probably will also need to update PROJ, also available for upgrade after adding the ubuntugis PPA:

sudo apt install proj-bin libproj-dev

Okay, updated PROJ without issue, and tried to re-install terra:

> install.packages("terra",dependencies = TRUE)

checking GDAL: checking whether PROJ is available for linking:... yes
checking GDAL: checking whether PROJ is available for running:... free(): invalid pointer
./configure: line 3651: 530311 Aborted                 ./gdal_proj
no
configure: error: OGRCoordinateTransformation() does not return a coord.trans: PROJ not available?
ERROR: configuration failed for package ‘terra’
* removing ‘/usr/local/lib/R/site-library/terra’
* restoring previous ‘/usr/local/lib/R/site-library/terra’

The downloaded source packages are in
    ‘/tmp/RtmprV9oTf/downloaded_packages’
Warning message:
In install.packages("terra", dependencies = TRUE) :
  installation of package ‘terra’ had non-zero exit status

Same issue. I don't understand why it's saying PROJ not available.

brownag commented 6 months ago

Try running

sudo ldconfig

Also, check this output

sudo ldconfig -p | grep libproj

Sometimes you just need to update symbolic links. There are some other weird things that can happen with multiple installations... If you have a nonstandard or want a specific version of proj you can install with

install.packages("terra", configure.args = "--with-proj-lib=/path/to/proj/include")

Also, for completeness, you probably also want to update GEOS and a couple other libs the latest version you can

From https://github.com/r-spatial/sf/#linux:


sudo apt-get install libudunits2-dev libgdal-dev libgeos-dev libproj-dev libsqlite0-dev
priyapatel1 commented 6 months ago

Try running

sudo ldconfig

Also, check this output

sudo ldconfig -p | grep libproj

Sometimes you just need to update symbolic links. There are some other weird things that can happen with multiple installations... If you have a nonstandard or want a specific version of proj you can install with

install.packages("terra", configure.args = "--with-proj-lib=/path/to/proj/include")

Also, for completeness, you probably also want to update GEOS and a couple other libs the latest version you can

From https://github.com/r-spatial/sf/#linux:

sudo apt-get install libudunits2-dev libgdal-dev libgeos-dev libproj-dev libsqlite0-dev

This is the output of the sudo ldconfig -p | grep libproj

    libproj.so.19 (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so.19
    libproj.so.15 (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so.15
    libproj.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so

Does that mean I have more than one version installed?

I tried reinstalling terra and get the same error.

configure: error: OGRCoordinateTransformation() does not return a coord.trans: PROJ not available?
ERROR: configuration failed for package ‘terra’
* removing ‘/usr/local/lib/R/site-library/terra’
* restoring previous ‘/usr/local/lib/R/site-library/terra’

I'm wondering if I should uninstall all GDAL and PROJ installs and reinstall them - thoughts?

sigmafelix commented 6 months ago

@priyapatel1

If you have sufficient privilege in your Digital Ocean virtual machine, I suggest you use rocker-geospatial Docker image to use a static version of dependencies. Docker and Apptainer will work to build an image from docker images listed in the link.

priyapatel1 commented 6 months ago

What if I delete older versions of PROJ and GDAL?

root@smoker:~# dpkg -l | grep -i proj
ii  libdap25:amd64                   3.20.5-1                          amd64        Open-source Project for a Network Data Access Protocol library
ii  libproj-dev:amd64                8.2.0-1~focal0                    amd64        Cartographic projection library (development files)
ii  libproj15:amd64                  6.3.1-1                           amd64        Cartographic projection library
ii  libproj19:amd64                  7.2.1-1~focal0                    amd64        Cartographic projection library
ii  libproj22:amd64                  8.2.0-1~focal0                    amd64        Cartographic projection library
ii  proj-bin                         7.2.1-1~focal0                    amd64        Cartographic projection library (tools)
ii  proj-data                        8.2.0-1~focal0                    all          Cartographic projection filter and library (datum package)
ii  runc                             1.1.7-0ubuntu1~20.04.2            amd64        Open Container Project - runtime
root@smoker:~# dpkg -l | grep -i gdal
ii  gdal-bin                         3.3.2+dfsg-2~focal2               amd64        Geospatial Data Abstraction Library - Utility programs
ii  gdal-data                        3.4.3+dfsg-1~focal0               all          Geospatial Data Abstraction Library - Data files
ii  libgdal-dev                      3.4.3+dfsg-1~focal0               amd64        Geospatial Data Abstraction Library - Development files
ii  libgdal26                        3.0.4+dfsg-1build3                amd64        Geospatial Data Abstraction Library
ii  libgdal29                        3.3.2+dfsg-2~focal2               amd64        Geospatial Data Abstraction Library
ii  libgdal30                        3.4.3+dfsg-1~focal0               amd64        Geospatial Data Abstraction Library
ii  python3-gdal                     3.3.2+dfsg-2~focal2               amd64        Python 3 bindings to the Geospatial Data Abstraction Library 
brownag commented 6 months ago
  libproj.so.19 (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so.19
  libproj.so.15 (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so.15
  libproj.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libproj.so

Does that mean I have more than one version installed?

@priyapatel1 Yes, this indicates you at least had 6.3.1 and 7.2.1 installed.

Multiple versions might not specifically be the problem--but wouldn't hurt (considering it is not working right now) to try purging/reinstalling. Sorry, is hard to say for sure what the issue is. I am guessing it is the libproj15 in there that is part of the issue, plus that not matching up with libproj-dev version and possibly the versions of GDAL/what PROJ they were installed with.

sudo apt purge libgdal* libproj* libgeos* gdal* proj*
priyapatel1 commented 6 months ago

Okay so I removed everything via sudo apt purge libgdal* libproj* libgeos* gdal* proj*

Then I reinstalled both and then reinstalled terra and now the values match!!!

Thank you so much for your help!