2DegreesInvesting / tiltPlot

Plots for the TILT project
https://2degreesinvesting.github.io/tiltPlot/
GNU General Public License v3.0

Better understand the system requirements of the sf package #69

Closed maurolepore closed 9 months ago

maurolepore commented 9 months ago

To help banks install tiltPlot we need to better understand the system requirements of the sf package.

The easiest way is how we did it ourselves, i.e. install Docker then run the rocker/geospatial image with something like this:

docker run -d -p 8787:8787 -e PASSWORD=<some password> rocker/geospatial

If they don't use Docker, then they would need to install each of the dependencies listed in this Dockerfile.

Most likely this will engage the IT folks and that may take time so it's best to start early and learn from the process.


Relates to a comment by Linda on Slack.

maurolepore commented 9 months ago

I did some experiments and I think the solution is to ensure the user has the pak package installed.

The pak package understands the system requirements listed in DESCRIPTION files, and automatically installs them. I think the easiest way to ensure users have pak is to ask them to install tiltPlot with pak via pak::pak("tiltPlot").
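
To preview which system packages pak would ask for before installing anything, something like the following sketch may help. It assumes a pak version that exports `pak::pkg_sysreqs()` (around 0.6.0 or later) and a working internet connection; `show_sysreqs()` is a hypothetical helper name, not part of pak or tiltPlot.

```r
# Hypothetical helper: look up the system requirements pak knows for `pkg`.
# Assumes pak exports pkg_sysreqs() and that the machine is online.
show_sysreqs <- function(pkg) {
  if (!requireNamespace("pak", quietly = TRUE)) {
    message("pak is not installed; run install.packages(\"pak\") first.")
    return(invisible(NULL))
  }
  tryCatch(
    pak::pkg_sysreqs(pkg),
    error = function(e) {
      # Typical causes: no internet access, or an older pak without this function.
      message("Could not look up system requirements: ", conditionMessage(e))
      invisible(NULL)
    }
  )
}

print(show_sysreqs("sf"))
```

On a locked-down machine the lookup is expected to fail with a message rather than an error, which is itself useful information to pass on to IT.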

Here is my experiment: I install pak, then use it to install tiltPlot and all of its package and system dependencies, recursively. Then I use the sf package without problems. Here, for example, I show that I can see the source code of one function of the sf package.

> install.packages("pak")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/pak_0.6.0.tar.gz'
Content type 'binary/octet-stream' length 7547024 bytes (7.2 MB)
==================================================
downloaded 7.2 MB

* installing *binary* package ‘pak’ ...
* DONE (pak)

The downloaded source packages are in
    ‘/tmp/RtmpJeXU8F/downloaded_packages’
> pak::pak("2DegreesInvesting/tiltPlot@test-sf")
! Using bundled GitHub PAT. Please add your own PAT using `gitcreds::gitcreds_set()`.
✔ Updated metadata database: 2.95 MB in 9 files.                               
✔ Updating metadata database ... done                                      

→ Will install 9 packages.
→ Will download 9 packages with unknown size.
+ classInt     0.4-10     [dl]
+ e1071        1.7-13     [dl]
+ ggalluvial   0.12.5     [dl]
+ proxy        0.4-27     [dl]
+ s2           1.1.4      [dl] + ✔ libssl-dev
+ sf           1.0-14     [dl] + ✖ libgdal-dev, ✖ gdal-bin, ✖ libgeos-dev, ✖ libproj-dev, ✔ libsqlite3-dev
+ tiltPlot     0.0.0.9001 [bld][cmp][dl] (GitHub: e810e6c)
+ units        0.8-4      [dl] + ✖ libudunits2-dev
+ wk           0.8.0      [dl]
→ Will install 5 system packages:
+ gdal-bin         - sf   
+ libgdal-dev      - sf   
+ libgeos-dev      - sf   
+ libproj-dev      - sf   
+ libudunits2-dev  - units
ℹ Getting 9 pkgs with unknown sizes
✔ Got e1071 1.7-13 (x86_64-pc-linux-gnu-ubuntu-22.04) (576.35 kB)                           
✔ Got proxy 0.4-27 (x86_64-pc-linux-gnu-ubuntu-22.04) (207.28 kB)                        
✔ Got classInt 0.4-10 (x86_64-pc-linux-gnu-ubuntu-22.04) (497.47 kB)                         
✔ Got ggalluvial 0.12.5 (x86_64-pc-linux-gnu-ubuntu-22.04) (1.66 MB)                         
✔ Got tiltPlot 0.0.0.9001 (source) (233.10 kB)                                               
✔ Got units 0.8-4 (x86_64-pc-linux-gnu-ubuntu-22.04) (352.57 kB)                            
✔ Got wk 0.8.0 (x86_64-pc-linux-gnu-ubuntu-22.04) (1.68 MB)     
✔ Got sf 1.0-14 (x86_64-pc-linux-gnu-ubuntu-22.04) (3.51 MB) 
✔ Got s2 1.1.4 (x86_64-pc-linux-gnu-ubuntu-22.04) (2.17 MB) 
✔ Downloaded 9 packages (10.88 MB)in 5.6s             
ℹ Installing system requirements
ℹ Executing `sudo sh -c apt-get -y update`
ℹ Executing `sudo sh -c apt-get -y install libgdal-dev gdal-bin libgeos-dev libproj-dev libudunits2-dev`
✔ Installed classInt 0.4-10  (121ms)                                                   
✔ Installed e1071 1.7-13  (158ms)                  
✔ Installed ggalluvial 0.12.5  (246ms)                                                
✔ Installed proxy 0.4-27  (316ms)                                                   
✔ Installed s2 1.1.4  (152ms)                                        
✔ Installed sf 1.0-14  (133ms)                                         
✔ Installed units 0.8-4  (118ms)                                       
✔ Installed wk 0.8.0  (128ms)                                        
ℹ Packaging tiltPlot 0.0.0.9001                                 
✔ Packaged tiltPlot 0.0.0.9001 (890ms)                              
ℹ Building tiltPlot 0.0.0.9001                                      
✔ Built tiltPlot 0.0.0.9001 (3.3s)                                  
✔ Installed tiltPlot 0.0.0.9001 (github::2DegreesInvesting/tiltPlot@e810e6c) (32ms)
✔ 1 pkg + 48 deps: kept 21, added 9, dld 9 (NA B) [53.2s]             
> library(fs)
> fs::dir_copy
function (path, new_path, overwrite = FALSE) 
{
    assert_no_missing(path)
    assert_no_missing(new_path)
    assert("`path` must be a directory", all(is_dir(path)))
    assert("Length of `path` must equal length of `new_path`", 
        length(path) == length(new_path))
    for (i in seq_along(path)) {
        if (!isTRUE(overwrite) && isTRUE(unname(is_dir(new_path[[i]])))) {
            new_path[[i]] <- path(new_path[[i]], path_file(path))
        }
        dir_create(new_path[[i]])
        dirs <- dir_ls(path[[i]], type = "directory", recurse = TRUE, 
            all = TRUE)
        dir_create(path(new_path[[i]], path_rel(dirs, path[[i]])))
        files <- dir_ls(path[[i]], recurse = TRUE, type = c("unknown", 
            "file", "FIFO", "socket", "character_device", "block_device"), 
            all = TRUE)
        file_copy(files, path(new_path[[i]], path_rel(files, 
            path[[i]])), overwrite = overwrite)
        links <- dir_ls(path[[i]], recurse = TRUE, type = "symlink", 
            all = TRUE)
        link_copy(links, path(new_path[[i]], path_rel(links, 
            path[[i]])), overwrite = overwrite)
    }
    invisible(path_tidy(new_path))
}
<bytecode: 0x559f39f2cbf0>
<environment: namespace:fs>

AnneSchoenauer commented 9 months ago

Hi @maurolepore, good news: I can install the pak package and load it without any problems. However, when I type pak::pak("2DegreesInvesting/tiltPlot"), it tells me:

"Using bundled GitHub PAT. Please add your own PAT using gitcreds::gitcreds_set(). Error: error in pak subprocess Caused by error: cannot install packages 2DegreesInvesting/tiltPlot: Cannot query GitHub are you offline?" And yes, I am offline ;)

With PACTA, I downloaded the ZIP folder of each repo and installed the packages locally, for example the r2dii.analysis package. Can I do the same here?
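
For reference, a local install from a downloaded repository ZIP could look roughly like the sketch below. `install_from_zip()` and the file name `tiltPlot-main.zip` are hypothetical; the sketch assumes the ZIP unpacks into a single top-level folder holding the package source, and it does not install any dependencies.

```r
# Hypothetical helper: install an R package from a ZIP of its source repository.
install_from_zip <- function(zip_path, exdir = tempfile("pkg-src-")) {
  if (!file.exists(zip_path)) {
    message("No such file: ", zip_path)
    return(invisible(FALSE))
  }
  utils::unzip(zip_path, exdir = exdir)
  # Assumption: the archive contains one top-level folder, e.g. tiltPlot-main.
  src <- list.dirs(exdir, recursive = FALSE)[1]
  # repos = NULL installs from the local path; dependencies are NOT installed.
  utils::install.packages(src, repos = NULL, type = "source")
  invisible(TRUE)
}

install_from_zip("tiltPlot-main.zip")  # hypothetical file name
```

Because dependencies such as sf are not pulled in this way, they would still have to be available from the local package folder.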

Otherwise I will reach out to IT!

Best Anne

AnneSchoenauer commented 9 months ago

One other thing: I just ran install.packages("sf") and it says that it is installed. How do I know if it works?

maurolepore commented 9 months ago

Anne,

The challenging package is sf, so instead of testing it indirectly via tiltPlot for now I would test it directly.

Please try this:

install.packages("sf")
library(sf)

The second line is important. My experiments showed that sf installs easily but then errors when you use it with library().
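
A slightly stronger check than library() alone is to load sf and then exercise the linked system libraries. The sketch below is one way to do that; `can_load()` is a hypothetical helper, while `sf::sf_extSoftVersion()`, `sf::st_point()`, `sf::st_sfc()` and `sf::st_is_valid()` are regular sf functions.

```r
# Hypothetical helper: TRUE if the namespace of `pkg` loads, FALSE otherwise.
can_load <- function(pkg) {
  tryCatch(
    {
      loadNamespace(pkg)
      TRUE
    },
    error = function(e) {
      message("Loading ", pkg, " failed: ", conditionMessage(e))
      FALSE
    }
  )
}

if (can_load("sf")) {
  # Versions of the system libraries sf is linked against (GEOS, GDAL, PROJ).
  print(sf::sf_extSoftVersion())
  # Build a point with a coordinate reference system and validate it.
  pt <- sf::st_sfc(sf::st_point(c(0, 0)), crs = 4326)
  print(sf::st_is_valid(pt))
}
```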

If you get any error please share it.

Then try the same with pak.

install.packages("pak")
pak::pak("sf")
library(sf)

maurolepore commented 9 months ago

RE you're offline: You must be online. How could install.packages() have worked if you were offline? Packages come from CRAN, which is an online resource.

I know you expect bank users to be offline, but the bank's IT must have access to CRAN somehow. Maybe they hold a copy of CRAN and keep it offline? Otherwise they could not use any package at all, e.g. dplyr, or even R itself.

At some point they must install software from a validated online resource, even if they block online access after installation.

AnneSchoenauer commented 9 months ago

With regard to the offline thing: the Bundesbank saves packages locally, and when I run install.packages() I install from this local folder. Apparently they included the package "sf" there. I will now run the code that you suggested above.

maurolepore commented 9 months ago

RE: with PACTA ... I was downloading the ZIP-folder of the Repos and locally installed ...

That should be necessary to solve this problem.

AnneSchoenauer commented 9 months ago

RE: library(sf)

When I run library(sf) I get the message "linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE", so it should be fine.

I also ran:

install.packages("pak")
pak::pak("sf")
library(sf)

I don't get any error messages

maurolepore commented 9 months ago

RE: Bundesbank saves packages locally

Yeah, that's what I imagined. At some point their IT connects to the internet and installs/saves software in their system so the analysts can later use it offline. So the experiments here would test the experience of the IT folks rather than that of the ultimate users of the system.

RE: Apparently they included there the package "sf".

Great, if they already have "sf" then it should all work.

maurolepore commented 9 months ago

Great, thanks @AnneSchoenauer for being our road tester.

@lindadelacombaz, this comment suggests sf is no problem for @AnneSchoenauer and then likely no problem for the banks -- although the only way to know for sure is to ask them to test it.

Note that after Anne installed sf, running library(sf) prints a success message:

linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE

What I'm not sure about is whether that works before installing pak. (By the time Anne ran library(sf), pak was already installed on her system.) But anyway it's clear that at most they would have to install pak first and then let pak do its magic under the hood.

In conclusion, I believe we don't need to think of any nuclear option for now. We can assume it will all just work, and react once we have evidence that they experience a problem.