pik-piam / remind

Contains the REMIND-specific routines for data and model output manipulation.
3 stars 18 forks source link

First automated tests including a template for data integrity checks #117

Closed Loisel closed 3 years ago

Loisel commented 3 years ago

Integrate a first fully automated test for the call to convGDX2MIF. GDXs are shipped in the folder tests/testgdxs.

A first data integrity check is performed on the output, by specifying a list of the form

list(
        `FE|Transport|Liquids (EJ/yr)` = "`FE|Transport|Liquids|Oil (EJ/yr)` + `FE|Transport|Liquids|Biomass (EJ/yr)` + `FE|Transport|Liquids|Hydrogen (EJ/yr)` + `FE|Transport|Liquids|Coal (EJ/yr)`")

This is done by using data.table at the moment. For full RSE support, I'd opt for quitte::calc_addVariables instead, but there seem to be an issue with quitte::as.quitte and I could not successfully convert the magclass object.

This is intended as a first stab and comments & suggestions are welcome!

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 3 years ago

there seem to be an issue with quitte::as.quitte and I could not successfully convert the magclass object.

Care to elaborate, @Loisel ?

giannou commented 3 years ago

this will ship the GDX with the package, right? I was under the impression that we don't do that so that our packages are CRAN-compatible

Loisel commented 3 years ago

there seem to be an issue with quitte::as.quitte and I could not successfully convert the magclass object.

Care to elaborate, @Loisel ?

This is what I get when applying as.quitte to the output of convGDX2MIF:

> as.quitte(a)
Error in magclass::`getNames<-`(`*tmp*`, value = sub(" \\(([^\\()]*)\\)($|\\.)",  : 
  Inconsistent names! Number of dots per name has always to be the same as it is separating different data dimensions

I updated my packages but it seems to persist. Checking with data.tables there seem not to be dots in the variable names. I have not looked into it any further. Can be some local issue, if you have time, give it a try. I just want to get it running so I used what I know works. Feel free to rewrite in quitte + dplyr.

Loisel commented 3 years ago

this will ship the GDX with the package, right? I was under the impression that we don't do that so that our packages are CRAN-compatible

Up until now, old.gdx is already shipped with the package in inst/extdata. I just moved it to tests/testgdxs so that you do not have to install the package (which I normally do not do when developing). According to this, see 14.4, it seems to be ok to include (small) data with tests, it can be that the CRAN policy is more strict about it. Maybe it would work to add the files to Rbuildignore, but I could not test it, since the package DESCRIPTION does not comply with CRAN standards at the moment:

 * checking for file ‘./DESCRIPTION’ ... ERROR
Required fields missing or empty:
  ‘Author’ ‘Maintainer’
* DONE

Status: 1 ERROR
See
  ‘/home/alois/git/remind-lib/..Rcheck/00check.log’
for details.
* checking for file ‘./DESCRIPTION’ ... ERROR
Required fields missing or empty:
  ‘Author’ ‘Maintainer’
* DONE

Status: 1 ERROR
See
  ‘/home/alois/git/remind-lib/..Rcheck/00check.log’
for details.
giannou commented 3 years ago

this will ship the GDX with the package, right? I was under the impression that we don't do that so that our packages are CRAN-compatible

Up until now, old.gdx is already shipped with the package in inst/extdata. I just moved it to tests/testgdxs so that you do not have to install the package (which I normally do not do when developing). According to this, see 14.4, it seems to be ok to include (small) data with tests, it can be that the CRAN policy is more strict about it. Maybe it would work to add the files to Rbuildignore, but I could not test it, since the package DESCRIPTION does not comply with CRAN standards at the moment:

 * checking for file ‘./DESCRIPTION’ ... ERROR
Required fields missing or empty:
  ‘Author’ ‘Maintainer’
* DONE

Status: 1 ERROR
See
  ‘/home/alois/git/remind-lib/..Rcheck/00check.log’
for details.
* checking for file ‘./DESCRIPTION’ ... ERROR
Required fields missing or empty:
  ‘Author’ ‘Maintainer’
* DONE

Status: 1 ERROR
See
  ‘/home/alois/git/remind-lib/..Rcheck/00check.log’
for details.

the old.gdx file was included in .Rbuildignore so it was not shipped with the package when someone installed it from PIK-CRAN (download size was 377 KB). Now both GDX files will be shipped at download, no? The DESCRIPTION issue can be fixed quickly.

Loisel commented 3 years ago

R CMD check --as-cran works now apart from the unavoidable dependencies warning.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 3 years ago
> as.quitte(a)
Error in magclass::`getNames<-`(`*tmp*`, value = sub(" \\(([^\\()]*)\\)($|\\.)",  : 
  Inconsistent names! Number of dots per name has always to be the same as it is separating different data dimensions

The responsible party has been notified #118.

Loisel commented 3 years ago

Thanks for looking into it @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q

giannou commented 3 years ago

No tests are performed when I run lucode2::buildlibrary(). Only way to deploy them is to run testthat::test_package("remind"). This is a bit weird, is that intended?

Loisel commented 3 years ago

Adding the files to .Rbuildignore unfortunately leads to them not being shipped for the test (which run in some temporary folder). My hunch would be to ship them with the package anyways. As far as I know, install.packages would not include the tests/ folder, so it should not make a big difference. Maybe CRAN would complain, but for the time being I think the benefits clearly outweigh the tradeoffs.

Loisel commented 3 years ago

@tscheypidi suggested to download the GDX on demand instead of including them with the repo, since it would bloat the repo size substantially if the binaries are updated once in a while. This is addressed in the most recent commit.

Loisel commented 3 years ago

Well this looks like one reliable solution … NOT!

Relying on random web services for package testing and deployment certainly never generated any problems at all

https://github.com/pik-piam/remind/blob/a4682799e958c3f50e26fa79b6eea9fbbe952e12/.buildlibrary#L8

I agree that this dependency is not optimal. I asked the IT if it would be possible to host these files, but I do not think they like the idea. On the other hand, if the web service stops working, tests can be easily deactivated.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 3 years ago

I asked the IT if it would be possible to host these files, but I do not think they like the idea.

:astonished:

http://www.pik-potsdam.de/~pehl/look_mum_no_external_hosting/ /home/pehl/www/look_mum_no_external_hosting

giannou commented 3 years ago

we could host them on the RSE server

Loisel commented 3 years ago

I asked the IT if it would be possible to host these files, but I do not think they like the idea.

astonished

http://www.pik-potsdam.de/~pehl/look_mum_no_external_hosting/ /home/pehl/www/look_mum_no_external_hosting

Nice. I also thought about hosting them on my homepage. I thought the CMS might not allow for random binaries to be uploaded. Good to know. I'll replace the links. I'll leave the MD5 checks in place, no offense :)

Loisel commented 3 years ago

we could host them on the RSE server

That would be even better, of course. How can we proceed to this end?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 3 years ago

@Loisel

For full RSE support, I'd opt for quitte::calc_addVariables instead, but there seem to be an issue with quitte::as.quitte and I could not successfully convert the magclass object.

If this is only about checking sums, might I direct your attention towards quite::check_quitte()?

Loisel commented 3 years ago

@Loisel

For full RSE support, I'd opt for quitte::calc_addVariables instead, but there seem to be an issue with quitte::as.quitte and I could not successfully convert the magclass object.

If this is only about checking sums, might I direct your attention towards quite::check_quitte()?

This looks like a good match. Initially I thought that one might want to check arbitrary equations, but maybe it is indeed sufficient to only check for correct summation. Opinions @fschreyer or @Renato-Rodrigues ?