ipeaGIT / geobr

Easy access to official spatial data sets of Brazil in R and Python
https://ipeagit.github.io/geobr/

Fixing package test coverage #68

Closed: rafapereirabr closed this issue 4 years ago

rafapereirabr commented 5 years ago

Hi Alan.

I opened an issue in the covr package explaining our problem, and one of the authors suggested a solution. Read it here. Please see whether you can implement their suggestion for geobr. My tip is to try the suggestion on just one script (test-read_amazon.R, for example), and then we can see whether it works.

rafapereirabr commented 5 years ago

Hi @pedro-andrade-inpe, do you have an idea of how to solve this issue?

pedro-andrade-inpe commented 5 years ago

Hello @rafapereirabr. Some suggestions:

1) Whenever possible, add an argument link = FALSE to the functions that download data. When link = TRUE, the function returns only the URL where the data is stored instead of downloading it. CRAN and covr tests could then use only link = TRUE and verify that the returned URL is the correct one.
2) Implement a function download_data(link) (possibly with other arguments, such as a filter) that downloads the data and returns a simple feature (is it always a simple feature?). All other functions that download data could use it, and it might be tested only once with a small dataset.
3) One external test could check whether all the rda data is OK by verifying the number of rows and the names of the columns.
4) Some functions, like read_census_tract(), download all the data (which takes time) and then subset it to return only the selected municipality. If prep_data creates one file per municipality (plus one file with all the municipalities), then the download would be much faster and testable for one municipality.
5) I don't know if you are aware of it, but it is possible to ignore some lines in covr (see https://rdrr.io/cran/covr/man/exclusions.html). Doing this often is not recommended, but it is fine for lines that take too much time, provided you guarantee in some other way that they will not fail.
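Suggestions 1, 2, and 5 could look roughly like the sketch below. It is only an illustration under stated assumptions: read_example(), base_url, and the .rds file layout are hypothetical and are not part of geobr, and download_data() is shown reading a plain R object rather than a simple feature. The `# nocov start` and `# nocov end` comments are the exclusion markers described in the covr documentation linked above.

```r
# Hypothetical base URL and file layout, used only for illustration.
base_url <- "https://example.org/geobr-data"

# Shared downloader (suggestion 2). Its body needs network access, so it is
# wrapped in covr's exclusion markers (suggestion 5) and does not count
# against coverage.
download_data <- function(link) {
  # nocov start
  dest <- tempfile(fileext = ".rds")
  utils::download.file(link, destfile = dest, mode = "wb", quiet = TRUE)
  readRDS(dest)
  # nocov end
}

# Hypothetical reader with a `link` argument (suggestion 1).
read_example <- function(year = 2010, link = FALSE) {
  if (!is.numeric(year) || length(year) != 1) {
    stop("`year` must be a single number")
  }
  url <- sprintf("%s/example_%d.rds", base_url, as.integer(year))
  if (isTRUE(link)) {
    return(url)  # return only the URL, no download
  }
  download_data(url)
}

# A fast test needs no download: it only checks the returned URL.
stopifnot(identical(
  read_example(year = 2010, link = TRUE),
  "https://example.org/geobr-data/example_2010.rds"
))
```

With this layout, argument validation and URL construction are covered by quick tests that can run on CRAN, while the slow download path is exercised once elsewhere.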

pedro-andrade-inpe commented 4 years ago

@rafapereirabr, in order to update your test coverage from a machine outside Travis or CRAN, just execute the three lines below:

Sys.setenv(NOT_CRAN = "true")
geobr_cov <- covr::package_coverage()
covr::codecov(coverage = geobr_cov, token = "xxx-xxxx-xxxx-xxxx-xxxx")

The tests were being skipped because of skip_on_cran(); setting the environment variable NOT_CRAN as shown above prevents that. This is the current situation of geobr:

geobr Coverage: 94.93%
R/geobr.R: 0.00%
R/read_intermediate_region.R: 80.00%
R/read_state.R: 87.76%
R/read_immediate_region.R: 92.00%
R/lookup_muni.R: 93.94%
R/read_census_tract.R: 94.23%
R/read_weighting_area.R: 95.35%
R/read_meso_region.R: 97.44%
R/read_micro_region.R: 97.44%
R/read_municipality.R: 97.83%
R/download_metadata.R: 100.00%
R/read_amazon.R: 100.00%
R/read_biomes.R: 100.00%
R/read_conservation_units.R: 100.00%
R/read_country.R: 100.00%
R/read_disaster_risk_area.R: 100.00%
R/read_health_facilities.R: 100.00%
R/read_indigenous_land.R: 100.00%
R/read_region.R: 100.00%
R/read_semiarid.R: 100.00%
R/read_urban_area.R: 100.00%

rafapereirabr commented 4 years ago

Hi Pedro. It makes sense! Thanks. However, I've run the code you suggested (results below) but the covr shield in README.md didn't change. It still says 1%.

>   covr::codecov( coverage = geobr_cov, token =' **** - ***** - ****' )
$uploaded
[1] TRUE

$url
[1] "https://codecov.io/github/ipeaGIT/geobr/commit/f00cc85cfa6241396053bdf75015dbd60e9bd985"

$queued
[1] TRUE

$meta
$meta$status
[1] 200

$message
[1] "Coverage reports upload successfully"

rafapereirabr commented 4 years ago

Hi Pedro. For some reason, the covr shield now points to 10% after your last pull request. Progress :)

pedro-andrade-inpe commented 4 years ago

@rafapereirabr, did you execute the command to set NOT_CRAN before running covr?

Sys.setenv(NOT_CRAN = "true")

The current report on codecov's webpage indicates that only download_metadata.R, lookup_muni.R, and read_amazon.R have coverage greater than 0%. These are the files whose tests do not call skip_on_cran().
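To illustrate why skipped tests lead to 0% coverage, here is a minimal base R sketch. It assumes a simplified stand-in for the check behind skip_on_cran(); testthat's real implementation may differ between versions, and not_cran() and run_slow_test() are hypothetical helpers. When a test is skipped, none of the package code it would call is executed, so covr has nothing to count for those files.

```r
# Simplified stand-in for the check behind skip_on_cran() (an assumption,
# not testthat's actual code).
not_cran <- function() {
  identical(Sys.getenv("NOT_CRAN"), "true")
}

# Hypothetical runner: executes `test_body` only when NOT_CRAN is "true".
run_slow_test <- function(test_body) {
  if (!not_cran()) {
    return("skipped")  # package code is never executed, so coverage stays 0%
  }
  test_body()
  "ran"
}

# Remember the current value so the session can be restored afterwards.
old <- Sys.getenv("NOT_CRAN", unset = NA)

Sys.unsetenv("NOT_CRAN")
status_unset <- run_slow_test(function() stopifnot(1 + 1 == 2))

Sys.setenv(NOT_CRAN = "true")
status_set <- run_slow_test(function() stopifnot(1 + 1 == 2))

# Restore the original environment.
if (is.na(old)) Sys.unsetenv("NOT_CRAN") else Sys.setenv(NOT_CRAN = old)

print(c(unset = status_unset, set = status_set))
```

The variable has to be set in the same R session, before covr::package_coverage() is called, for the slow tests to run and be counted.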

rafapereirabr commented 4 years ago

Yes, I did. This is the code I ran, but it didn't have any effect:

# Update package coverage
Sys.setenv(NOT_CRAN = "true")
geobr_cov <- covr::package_coverage()
covr::codecov(coverage = geobr_cov, token = "xxx-xxxx-xxxx-xxxx-xxxx")

I keep skip_on_cran() in most test scripts because they take too much time and would slow down CRAN's evaluation process.

pedro-andrade-inpe commented 4 years ago

@rafapereirabr, this is very strange. I have just created a codecov repository for geobr under my username (https://codecov.io/gh/pedro-andrade-inpe/geobr) and uploaded the latest covr report. It says that the total coverage is 67%, which is a little bit less than the output of package_coverage() in R. See the badge here.

pedro-andrade-inpe commented 4 years ago

Now with

Sys.setenv(NOT_CRAN = "true")

the result shown above is 94%.

rafapereirabr commented 4 years ago

Hmm... I just noticed a difference between our codecov repos. In my repository, it says that CI failed, while in yours it says CI passed. I couldn't find the reason it failed in my case. Any clues?

pedro-andrade-inpe commented 4 years ago

Well, let's see what happens when codecov is updated after a new commit. Maybe codecov gets a little lost when the coverage report is updated several times without new commits.

rafapereirabr commented 4 years ago

Your last pull request has solved this issue! Thanks!