ipeaGIT / geobr

Easy access to official spatial data sets of Brazil in R and Python
https://ipeagit.github.io/geobr/

geobr archived on CRAN #272

Closed: rafapereirabr closed this issue 2 years ago

rafapereirabr commented 2 years ago

Our data servers have been particularly unstable these past weeks because we are moving to a new building. My apologies for the inconvenience. So it happened again. This is the fourth time we have received this warning from the CRAN team, so they've decided to archive the geobr package.

'Packages which use Internet resources should fail gracefully with an informative message
if the resource is not available or has changed (and not give a check warning nor error).'

We are currently trying to fix this asap. Any help would be very welcome.

short term solution

long term solution

rafapereirabr commented 2 years ago

geobr has been resubmitted to CRAN. See CRAN incoming dashboard.

MatthieuStigler commented 2 years ago

thanks for re-submitting, geobr is very useful!

Have you been thinking about providing the metadata_gpkg.csv file as part of the package, maybe with a update_metadata function?

rafapereirabr commented 2 years ago

> thanks for re-submitting, geobr is very useful!
>
> Have you been thinking about providing the metadata_gpkg.csv file as part of the package, maybe with a update_metadata function?

Hi @MatthieuStigler. Thanks for the support. Regarding your question, what exactly do you mean?

MatthieuStigler commented 2 years ago

Well, at the time I was thinking it would be good to store the metadata_gpkg.csv file within the package, so that one could access it even if the server is down.

But in retrospect, this doesn't make much sense: since the package's goal is to download data, having access to just the metadata when the server is down won't help much.

MatthieuStigler commented 2 years ago

Hi @rafapereirabr, I see the package got archived again?

Is the problem arising due to examples, or from the vignette?

If it's from the vignette, have you considered simply providing the compiled md document instead of the Rmd? It doesn't sound trivial, but it seems doable: https://stackoverflow.com/questions/52942232/creating-vignette-from-md-file-not-rmd

Thanks!

rafapereirabr commented 2 years ago

Hi @MatthieuStigler.

geobr has been off CRAN since 23 Jan. I have fixed the problems and submitted a new version, v1.6.6, to CRAN on Feb 4th. They have ignored my submission. I've sent a couple of messages to the CRAN maintainers (Brian and Uwe), but I haven't heard back from them. I paste below the emails I have sent them to make this a public record.

Email sent to Brian on Jan 26th and to Uwe on Feb 4th. This message was included in the CRAN submission of geobr v1.6.6 on Jan 26th

Dear Brian,

Please accept my apologies. I've been struggling to overcome this problem and I'm sorry for the hassle caused. After the last warning, I included in the package a support function to check for any issues with the internet connection (on either the user's side or our server's side). The function is working fine, but I did not realize that it was NOT halting parent functions. When there was an internet issue, the connection test was correctly throwing an informative message, but the halt was not propagating to the calling functions. I have now gone over every function of the package and added checks to test whether files have been successfully downloaded whenever there is a download. This has made the package slower, but it solves the problem. I've now done several tests checking the behavior of every function in our package under different scenarios (user with no connection, our server offline, our server timing out), and the package now behaves (and fails) as required by CRAN policies.

I hope it is OK for us to resubmit the geobr package to CRAN. The package has a large base of R users (over 50K downloads, with approx. 2K downloads per month) and it is extremely important to keep supporting them with open data. CRAN feedback has been very helpful in improving the package and I very much appreciate your support.

best wishes,

Rafael H. M. Pereira

Email sent to Brian and to Uwe on Mar 28th.

Dear Brian and Uwe,

I hope this email finds you well. I am writing to you to ask for another chance for the geobr package on CRAN.

As I have explained in a previous email a couple months ago (see message below), the problem with geobr was that, when there was an internet connection issue, the connection test was correctly throwing an informative message but the effect of halting functions was not propagating to parent functions called by users. I have now gone over every function of the package and included checks to make sure the package fails gracefully as per CRAN policies.

The geobr package has a large base of R users (over 50K downloads, with approx. 2K downloads per month). These numbers reflect the relevance of geobr in supporting the R and spatial data communities worldwide, and particularly in Brazil, where we have a vibrant and growing community of R users. I would like to kindly ask you to consider the resubmission of geobr to CRAN.

Kind regards,

Rafael Pereira
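The propagation problem described in these emails can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the package's actual code: `check_connection` here is a stand-in for the helper mentioned in the email, and `probe`, `download`, `read_data_buggy` and `read_data_fixed` are hypothetical names.

```python
from typing import Callable, Optional


def check_connection(probe: Callable[[], bool]) -> bool:
    """Return True if the resource is reachable; otherwise print a message and return False."""
    try:
        ok = probe()
    except OSError:
        ok = False
    if not ok:
        print("Problem connecting to data server. Please try again in a few minutes.")
    return ok


def read_data_buggy(probe: Callable[[], bool], download: Callable[[], bytes]) -> bytes:
    # Bug: the result of the check is ignored, so the download is attempted anyway
    # and its error reaches the user (and the CRAN check) despite the message.
    check_connection(probe)
    return download()


def read_data_fixed(probe: Callable[[], bool], download: Callable[[], bytes]) -> Optional[bytes]:
    # Fix: the caller inspects the result and stops early instead of raising.
    if not check_connection(probe):
        return None
    return download()
```

In the buggy variant the informative message is printed, but the parent function carries on and fails anyway. In the fixed variant the failure stops the caller, which returns `None`.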

MatthieuStigler commented 2 years ago

I understand this can be very frustrating! CRAN maintainers also have a ton of issues to deal with.

Maybe you can just try to resubmit the package? If it passes the tests without any problems (does it pass them even with a failing connection?), it might just go straight to CRAN?

rafapereirabr commented 2 years ago

I resubmitted geobr to CRAN on Feb 4th. It passes all tests. However, the package is being scrutinized as a 1st submission. Besides, CRAN maintainers keep a log of the submission history of all packages, so they know that geobr has been suspended and, so it seems, have chosen to ignore it. I'm trying to resolve this situation asap.

It is really important to highlight that geobr is working fine for both R and Python users. Anyone can install geobr in R from GitHub and use it: devtools::install_github("ipeaGIT/geobr", subdir = "r-package")

Being on CRAN would be nice because it makes it easier for users to install geobr, but that's it.

rodrigolustosa commented 2 years ago

Would it help to "flood" their inbox with requests from geobr users (us)? Maybe with a template e-mail (that everyone could just copy and paste from here) explaining the package's importance and that the problem has already been fixed (and other important information).

MatthieuStigler commented 2 years ago

Hi Rafael, I am sorry to hear that about CRAN, I am sure it is frustrating. That said, I don't think flooding their inbox would be a good idea by any means. CRAN maintainers do a great volunteer job with a ton of packages, so that could be quite counter-productive!

If you ask CRAN again, it would be crucial to make absolutely sure the package works under any circumstances (i.e. assuming the server is down). I just ran a test and got a very long test time and several errors, see below. More importantly though, did you make sure the package/tests/examples would work even if the server is down?

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test-read_indigenous_land.R:16:3): read_indigenous_land ────────────
is(test_sf, "sf") is not TRUE

actual: FALSE
expected: TRUE
── Failure (test-read_indigenous_land.R:19:3): read_indigenous_land ────────────
test_sf$code_terrai %>% length() not equal to 615.
1/1 mismatches
[1] 0 - 615 == -615
── Failure (test-read_intermediate_region.R:12:3): read_intermediate_region ────
is(read_intermediate_region(), "sf") is not TRUE

actual: FALSE
expected: TRUE

[ FAIL 3 | WARN 0 | SKIP 0 | PASS 208 ]
Error: Test failures

MatthieuStigler commented 2 years ago

Oh, I just saw that you removed these tests from the CRAN checks!

I tried with

devtools::check_win_devel()
devtools::check_win_release()
devtools::check_win_oldrelease()

and got all clean results, nice job! Was I just lucky, or are the tests/examples now guaranteed to work even if the server is down?
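One way to check this without waiting for a real outage is to force every request to fail inside a test. The sketch below is an illustration in Python, not geobr's test suite: `read_remote` and `simulate_outage` are hypothetical names, and the reader is assumed to return `None` with a message when the download fails.

```python
import urllib.error
import urllib.request
from typing import Optional
from unittest import mock


def read_remote(url: str, timeout: float = 10.0) -> Optional[bytes]:
    """Hypothetical reader: returns the payload, or None with a message on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as err:
        # URLError and socket timeouts are both subclasses of OSError
        print(f"Could not reach {url} ({err}). Please try again later.")
        return None


def simulate_outage(url: str) -> Optional[bytes]:
    """Call read_remote while every request is forced to fail, as if the server were down."""
    outage = urllib.error.URLError("simulated outage")
    with mock.patch("urllib.request.urlopen", side_effect=outage):
        return read_remote(url)
```

A test can then assert that `simulate_outage(...)` returns `None` rather than raising, so it does not depend on the real server being up or down.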

rafapereirabr commented 2 years ago

Hi @MatthieuStigler, geobr has many functions and tests, which can take quite some time to run. This is why I've chosen to run all tests and checks locally and on GitHub Actions.

geobr currently downloads the data from our Ipea server. So the server needs to be online for geobr to work properly. Our server had a few hiccups a few months ago, but it is up and running well. It has been stable for the past one or two months now.

rafapereirabr commented 2 years ago

Package geobr v1.7.0 submitted to CRAN with the following comment:

-- R CMD check results -------------------------------------- geobr 1.6.599909 ----
Duration: 3m 49.4s

checking data for non-ASCII characters ... NOTE
  Note: found 58 marked UTF-8 strings

0 errors v | 0 warnings v | 1 note x

  • This is a submission to get the geobr package back on CRAN.

The geobr package was suspended on CRAN in January 2022 because it repeatedly failed to comply with CRAN's policy to "fail gracefully" when there are any internet connection problems.

We have scrutinized the package, which has now gone through structural changes to address this issue. Here are the main changes:

  1. New internal function check_connection() and tests that cover cases when users have no internet connection, and when URL links are offline, time out or work normally.
  2. All functions that require an internet connection now use check_connection() and return informative messages when URL links are offline or time out.
  3. The data used in the package is now simultaneously stored on two independent servers, one of which is used as a backup link. In other words, geobr first tries to download the data from server 1. If, for some reason, the download fails because of internet connection problems, then geobr tries to download the data from server 2. If this second attempt fails, then the package returns invisible(NULL) with an informative message.

We believe these changes and the redundancy in data storage have made the geobr package substantially more robust and in line with CRAN's policies.
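The fallback logic in item 3 can be sketched roughly as follows. This is a Python illustration of the idea under stated assumptions, not the package's R implementation: `download_with_fallback` and the injected `fetch` callable are hypothetical names, and any `OSError` (which includes connection errors and timeouts) is treated as a connection problem.

```python
from typing import Callable, Optional, Sequence


def download_with_fallback(
    urls: Sequence[str],
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each URL in order; return the first payload, or None if all attempts fail."""
    for url in urls:
        try:
            return fetch(url)
        except OSError:
            # Connection error or timeout: move on to the next mirror
            continue
    # Every mirror failed: inform the user and return None instead of raising
    print("Problem connecting to data server. Please try again in a few minutes.")
    return None
```

Passing the fetcher in as an argument keeps the ordering logic separate from the network call, so the "server 1 down, server 2 up" and "both down" cases can be exercised with a fake fetcher.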

rafapereirabr commented 2 years ago

We are back on CRAN with geobr v1.7.0

MatthieuStigler commented 2 years ago

great, thanks for the awesome work!!!