SITS: Satellite Image Time Series Analysis for Earth Observation Data Cubes (submit R package for review) #596

Open gilbertocamara opened 1 year ago

gilbertocamara commented 1 year ago

Submitting Author Name: Gilberto Camara
Submitting Author Github Handle: @gilbertocamara
Other Package Authors Github handles: @rolfsimoes, @OldLipe, @pedro-andrade-inpe
Repository:
Version submitted: 1.4.2
Submission type: Standard
Editor: @mpadge
Reviewers: @mikemahoney218, @paleolimbot

Due date for @mikemahoney218: 2024-08-07
Due date for @paleolimbot: 2024-08-07

Archive: TBD
Version accepted: TBD
Language: en


(a) Vignettes: instead of preparing vignettes, the authors have written an on-line book that describes the contents of the package in detail. The book is available at the URL

Important notes:

(1) To run the tests, examples, and code coverage, please make sure the following environment variables are set in the R session:

Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")

sits is a fairly large package, and the tests take a long time to run, since they access cloud services. For this reason, testing needs to be manually enabled.

(2) Please review version 1.4.2, not yet on CRAN, which is available in the "dev" branch in the github repository.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [x] The package is novel and will be of interest to the broad readership of the journal.
- [x] The manuscript describing the package is no longer than 3000 words.
- [x] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code).
- (*Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon.

Editor check started


Checks for sits (v1.4.1)

git hash: 6eac9edf

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL-2

1. Package Dependencies

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:
- code in C++ (5% in 14 files) and R (95% in 131 files)
- 8 authors
- no vignette
- 4 internal data files
- 18 imported packages
- 164 exported functions (median 11 lines of code)
- 1988 non-exported functions in R (median 7 lines of code)
- 73 R functions (median 11 lines of code)

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges
R-CMD-check.yaml

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:-----------|:----------|:------|----------:|:----------|
| 5438924419|R-CMD-check |success |bc5d6c | 116|2023-07-02 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with rcmdcheck

R CMD check generated the following note:

1. checking installed package size ... NOTE
   installed size is 16.7Mb
   sub-directories of 1Mb or more:
     libs  14.1Mb

R CMD check generated the following check_fail:

1. rcmdcheck_reasonable_installed_size

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

gilbertocamara commented 1 year ago

Many thanks for your response. Please see the explanation below, which was included as an "Important Note" in the submission but may not have caught the reviewers' attention.

✖️ Package coverage is 0.1% (should be at least 75%). Package coverage is actually 95%. Please see

sits is a large package. There are more than 1,100 individual tests that take a long time to run. Some of these tests access cloud services, which might be temporarily offline. For this reason, testing needs to be manually enabled. To run the tests, examples, and code coverage, please set the following environment variables in the R session:

Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
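
For readers unfamiliar with this kind of opt-in switch, the pattern can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper names `tests_enabled()` and `run_if_enabled()` are hypothetical and are not taken from the sits source. Only the variable name SITS_RUN_TESTS and the value "YES" come from the note above.

```r
# Hypothetical helper (not from the sits source): report whether the
# opt-in switch for long-running tests is set in the current R session.
tests_enabled <- function(var = "SITS_RUN_TESTS") {
    # Sys.getenv() returns "" when the variable is unset
    identical(Sys.getenv(var), "YES")
}

# Hypothetical wrapper: run a long, cloud-dependent check only when the
# switch is on; otherwise skip it and return NULL invisibly.
run_if_enabled <- function(check_fun) {
    if (!tests_enabled()) {
        message("Skipping: set SITS_RUN_TESTS=YES to enable")
        return(invisible(NULL))
    }
    check_fun()
}
```

With a gate like this, an automated check that cannot set environment variables skips the gated code, which is one way a coverage tool could report a very low figure for a package that is well tested when the switch is on.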

We are confident that sits meets the required criteria for ROpenSci review.

We would also like to respond to the lintr message:

Avoid library() and require() calls in packages - 16 

The package directly imports 17 packages, which are required for most functions. It also suggests 33 packages, which are typically used in only a few functions and therefore need to be included only on an "as-is" basis. This follows CRAN policies that restrict the number of imported packages.

maelle commented 12 months ago

Thank you for your submission @gilbertocamara! As well as your careful response to the automatic checks. We agree with all your responses above.

However, as this is a package that implements statistical and ML methods for geospatial data, rather than just the “accessing, manipulating, converting” in our scope, it falls under our newer statistical peer review program, which has its own time series and geospatial standards. Submission requirements are different for this program, as authors need to document their standards compliance with our code annotation system.

A note - SITS is a package that is very large in scope and code base, as exemplified by the fact that it has a whole book for its documentation. As such, we anticipate that it will be challenging to find reviewers and we will need to give them considerably longer than usual to review the code base and documentation in full. Most of our submissions are not as large or mature at the point of review and are still open to significant API or architecture changes in response to review. For something in an earlier stage we would likely have suggested breaking functionality up into smaller, more focused packages. Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the withr::local_envvar() function or similar.
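
As a sketch of the "or similar" option, the same effect can be obtained in base R by setting the variable inside a function and restoring the previous state on exit. The name `with_tests_enabled()` is hypothetical and used only for illustration. It is not part of sits or withr.

```r
# Base-R sketch: temporarily set SITS_RUN_TESTS while `code_fun` runs,
# then restore whatever state the variable had before the call.
# `with_tests_enabled()` is a hypothetical name used for illustration.
with_tests_enabled <- function(code_fun) {
    old <- Sys.getenv("SITS_RUN_TESTS", unset = NA)
    on.exit({
        if (is.na(old)) {
            Sys.unsetenv("SITS_RUN_TESTS")      # was unset before
        } else {
            Sys.setenv(SITS_RUN_TESTS = old)    # restore previous value
        }
    }, add = TRUE)
    Sys.setenv(SITS_RUN_TESTS = "YES")
    code_fun()
}

# The variable is "YES" only while the supplied function runs
with_tests_enabled(function() Sys.getenv("SITS_RUN_TESTS"))
```

Restoring on exit keeps the change from leaking into code that runs afterwards in the same session.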

Thanks again! We're happy to answer further questions.

gilbertocamara commented 12 months ago

Dear @maelle, many thanks for your response. Please see my comments below:

Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

Good! Looking at the specific requirements for ROpenSci statistical packages, the sits package meets most of them, such as G.2 (related to data input), G.3 (algorithms), G.4 (output data). We will have to review the sits package carefully with respect to requirements G.1 (documentation), G.5 (testing) and those for machine learning.

At first glance, sits complies with requirements SP (spatial software), TS (time series), and UL (unsupervised learning). Since these requirements are very detailed, we will carefully review them to ensure compliance. We believe we meet the PD (probability distributions) requirements.

One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the withr::local_envvar() function or similar.

Allow me to propose an alternative: please consider the information provided in "" to be sufficient to assert that sits meets the code coverage requirements of ROpenSci. If you accept this proposal, it will save us both time and work.

We will work on improving sits so that it meets the specifications for ROpenSci statistical packages. We will report back to you when we have a new version that fully meets such specs.

Thanks, Gilberto

maelle commented 12 months ago

Thank you! :tada: Here's a direct link to the author guide for stat submissions:

maelle commented 12 months ago

@gilbertocamara just a clarification: your package will have to comply with one of the category standards of the statistical review system, probably spatial (because time series is for class-based manipulation of time-series data, which is not what your package does as far as I understand).

Probability distributions should be considered an "additional" category that may be complied with in addition to the main categories.

Thank you!

gilbertocamara commented 12 months ago

Dear @maelle, thanks for the clarification. As part of the pre-submission process of the package "sits" to the ROpenSci statistical review, as recommended by you, I tried to run autotest_package. I made sure that all the examples and tests run OK. However, when running autotest_package, I am finding an error, which I reported in

maelle commented 11 months ago

@gilbertocamara thanks for the report. I installed the dependencies of the dev branch, and I'm working with the dev branch itself. When running the examples I got

Error: No BDC_ACCESS_KEY defined in environment

Is there a comprehensive list of credentials needed to run the examples and tests? If there's such a list and one can create free accounts easily, I can get the credentials. If not, could these examples be skipped? For example, see the @examplesIf tag. I see tests are skipped, but a list of steps necessary to run all tests could be useful (even if, for instance, a reviewer might not create an AWS account).
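
A minimal sketch of how the roxygen2 `@examplesIf` tag can guard an example that needs a credential. The function `get_access_key()` is a hypothetical helper written for this illustration, not a sits function. Only the variable name BDC_ACCESS_KEY and the error text come from the message above.

```r
#' Read the access key for a cloud provider (hypothetical helper)
#'
#' @param var Name of the environment variable holding the key.
#' @return The key as a character string; an error if it is not set.
#' @examplesIf nzchar(Sys.getenv("BDC_ACCESS_KEY"))
#' # This example is run only when the key is defined
#' key <- get_access_key("BDC_ACCESS_KEY")
#' @export
get_access_key <- function(var) {
    key <- Sys.getenv(var)
    if (!nzchar(key)) {
        stop("No ", var, " defined in environment", call. = FALSE)
    }
    key
}
```

The condition after `@examplesIf` is ordinary R code, so reviewers without the credential see the example in the documentation but do not hit the error when examples are executed.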

A good place to add setup info would be the installation instructions, and/or the file .github/

Thank you!

maelle commented 11 months ago

R CMD check output attached, I'm getting a test failure as well as some warnings. What can have caused this failure?

I'd also recommend tackling the deprecation warnings from the tests before a full submission.

gilbertocamara commented 11 months ago

Dear @maelle, many thanks. One of the problems with the examples is that they assume the user has obtained a key to access the Brazil Data Cube (BDC). I have fixed the problem and removed this restriction. I ran R CMD CHECK with these changes, and now everything should work for you. Please use the new "dev" version that I have just uploaded.

gilbertocamara commented 11 months ago

Dear @maelle, one of the main challenges of the sits package is to provide a single point of access to different cloud services of Earth observation data, including AWS, Microsoft, NASA, and Digital Earth Africa, to name a few. Each provider requires different kinds of access credentials. In the on-line book, Chapter 4, we explain how to access each service. We recognize that this information needs to be put somewhere else. We will improve the "Setup" chapter in the on-line book, and also prepare a new

Thanks again!

maelle commented 11 months ago

@gilbertocamara the CONTRIBUTING guide could link to the book chapter, as long as it's easy to find all information.

I'm still waiting for R CMD check to finish, but examples ran without error (tests now running).

gilbertocamara commented 11 months ago

Thanks for the tips!

maelle commented 11 months ago

Tests passed! Now on to trying autotest...

gilbertocamara commented 11 months ago

Dear @maelle, autotest runs OK in "sits". Now, we have to work on the recommendations. Thanks!

gilbertocamara commented 11 months ago

Dear @maelle @mpadge I would like to ask for your help to understand how autotest works. As I understand it, autotest runs different diagnostics on the functions of a package. It aims to test the resilience of the function to unexpected values of the parameters, for example NA values. It also tries to guess the parameter type from the Rd documentation; here, it tests the function for invalid entries, e.g., numeric inputs for integer parameters. That's important and valuable for software designers.
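
The kind of input mutation described above can be illustrated with a toy harness in base R. This is not how autotest is implemented. `take_first()` and `probe()` are hypothetical names, and the error texts only mimic the style of the messages quoted later in this comment.

```r
# Toy function with a documented integer parameter `n` (hypothetical).
take_first <- function(x, n = 1L) {
    if (is.na(n)) {
        stop("invalid n parameter (NA value is not allowed)", call. = FALSE)
    }
    if (n != round(n)) {
        stop("invalid n parameter (value is not integer)", call. = FALSE)
    }
    x[seq_len(n)]
}

# Call `fn` once per mutated value of `n` and record what happened.
probe <- function(fn, x, mutations) {
    vapply(mutations, function(n) {
        tryCatch({
            fn(x, n = n)
            "ok"
        },
        warning = function(w) paste("warning:", conditionMessage(w)),
        error = function(e) paste("error:", conditionMessage(e)))
    }, character(1))
}

# One valid value and two mutations: NA and a non-integer number
probe(take_first, 1:10,
      list(valid = 3L, missing = NA_integer_, fractional = 0.3))
```

A harness of this sort only records that an error occurred and what it said. Whether that error counts as the "expected" response is a separate decision, which is the point of the question that follows.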

In the sits package, the authors have been very careful to include pre-conditions for all parameters of all functions. All parameters are checked for valid values, and an error message is provided. However, we are finding there is a mismatch between the error messages provided by sits and those expected by autotest. For us, it is not clear what autotest considers as a valid response.

Consider the following function, which takes as input a set of spatially referenced time series and allows the user to select some of its members. Users can either select a number or a fraction of the series. The relevant part of the code is shown below:

#' @title Sample a percentage of a time series
#' @name sits_sample
#' @author Rolf Simoes, \email{}
#' @description Takes a sits tibble with different labels and
#'              returns a new tibble. For a given field as a group criterion,
#'              this new tibble contains a given number or percentage
#'              of the total number of samples per group.
#'              Parameter n: number of random samples.
#'              Parameter frac: a fraction of random samples.
#'              If n is greater than the number of samples for a given label,
#'              that label will be sampled with replacement. Also,
#'              if frac > 1 , all sampling will be done with replacement.
#' @param  data       Sits time series.
#' @param  n          Integer: number of samples to select (range: 1 to nrow(data)).
#' @param  frac       Percentage of samples to pick from each group of data.
#' @param  oversample Oversample classes with small number of samples?
#' @return            A sits tibble with a fixed quantity of samples.
#' @examples
#' # Retrieve a set of time series with 2 classes
#' data(cerrado_2classes)
#' # Print the labels of the resulting tibble
#' summary(cerrado_2classes)
#' # Samples the data set
#' data_100 <- sits_sample(cerrado_2classes, n = 100)
#' # Print the labels
#' summary(data_100)
#' # Sample by fraction
#' data_02 <- sits_sample(cerrado_2classes, frac = 0.2)
#' # Print the labels
#' summary(data_02)
#' @export
sits_sample <- function(data,
                        n = NULL,
                        frac = NULL,
                        oversample = TRUE) {
    # set caller to show in errors
    # verify if data is valid
    # verify if either n or frac is informed
        x = !(purrr::is_null(n) & purrr::is_null(frac)),
        local_msg = "neither 'n' or 'frac' parameters were informed",
        msg = "invalid sample parameters"
    # check oversample
    # check n and frac parameters
    if (!purrr::is_null(n))
        .check_num(n, allow_na = FALSE, is_integer = TRUE,
                   min = 1, max = nrow(data),
                   len_min = 1, len_max = 1,
                   msg = "invalid n parameter")
    if (!purrr::is_null(frac))
        .check_num(frac, allow_na = FALSE, is_integer = FALSE,
                   min = 0.0, max = 10.0,
                   len_min = 1, len_max = 1,
                   msg = "invalid frac parameter")

The output for autotest for this function is:

  type  test_name fn_name     parameter parameter_type operation content               test  
1 error NA        sits_sample NA        NA             NA        sits_sample: invalid… TRUE 

In the above the content column is:

sits_sample: invalid n parameter (value is not integer)

We are failing to understand what is being tested by autotest and what is the expected response. As you can see from the code above, we explicitly test for NA and test for the valid values of the input parameters. In principle, we cannot find flaws in the error messages we provide. Please see some examples below.

> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = 0.3)
> Error: sits_sample: invalid n parameter (value is not integer)

sits_sample(cerrado_2classes, frac = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, frac =  30)
> Error: sits_sample: invalid frac parameter (value should be <= 10)

We are failing to see what we might be doing wrong. What are the expectations of autotest which are not met by our input parameter tests?

We would appreciate your response.

Best Gilberto

gilbertocamara commented 11 months ago

Dear @maelle @mpadge

Please, could you explain what appears to be an unexpected behaviour of autotest?

Today, I ran autotest twice on version 1.4.2 (dev) of the sits package. The first response had 16 issues (please see the RDS file in From what I could understand from the autotest output, it complains about the expected return values of R functions that are called for side-effects.

I tried to fix some of these problems by considering the recommendations of the tidyverse design guide. In Section 26 ("Side-effect functions should return invisibly"), the guide states: "If a function is called primarily for its side-effects, it should invisibly return a useful output. If there’s no obvious output, return the first argument". See more at
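
A short sketch of the quoted guideline, using a hypothetical function `save_labels()` (not a sits function) that is called for its side effect of writing a file:

```r
# A function called for its side effect (writing a file) that follows the
# quoted guideline by returning its first argument invisibly.
# `save_labels()` is a hypothetical example, not a sits function.
save_labels <- function(labels, path) {
    writeLines(labels, path)
    invisible(labels)
}

out_file <- tempfile(fileext = ".txt")
# Called at top level, nothing is printed ...
save_labels(c("Cerrado", "Pasture"), out_file)
# ... but the value is still available for assignment or piping
kept <- save_labels(c("Cerrado", "Pasture"), out_file)
```

`withVisible(save_labels("a", out_file))$visible` is `FALSE` for such a function, which gives a simple way to check the behaviour in a test.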

I am assuming that autotest follows the same guidelines. Thus, I included invisible return values in all sits functions that are called for side-effects. Then, I ran autotest again. To my surprise, it flagged 48 issues. Please see the second autotest output at

Could you please help me and explain why the number of issues reported by autotest increased from 16 to 48? Your help will be most appreciated.


# install dev version
# enable examples and tests
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
Sys.setenv("SITS_RUN_TESTS" = "YES")
# first run of autotest
autotest_1 <- autotest::autotest_package(package = "sits", test = TRUE)
# second run of autotest
autotest_2 <- autotest::autotest_package(package = "sits", test = TRUE)

Best regards Gilberto

maelle commented 11 months ago

Hello! I'll get to this later this week, thanks for your patience!

gilbertocamara commented 11 months ago

Dear @maelle @mpadge Begging your indulgence for being insistent, I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest. Consider the following case. The sits package deals with big data, processing time series of satellite images. All functions that produce new images need to specify a directory where the results are stored. This is achieved by a parameter called output_dir, which is used in 18 functions, with the same parameter name and the same use.

Out of these 18 instances, autotest produces a diagnostic in only two (2) cases. In both instances, it produces a single_char_case diagnostic. As I understand it, this diagnostic works on the premise that changing the case of a character parameter should yield the same result. Obviously, this expectation cannot be met by operating systems where directory names are case-dependent.
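
The point about case-dependent directory names can be checked with a few lines of base R. This is only an illustration of the file-system behaviour, not of autotest itself, and the variable names are chosen for the example.

```r
# Create a directory with a lower-case name, then build the same path
# with the final component converted to upper case.
base_dir <- tempfile("sits_out_")
dir.create(base_dir)
upper_dir <- file.path(dirname(base_dir), toupper(basename(base_dir)))

# The directory that was just created exists
dir.exists(base_dir)
# Whether the upper-case variant resolves to the same directory depends
# on the file system: case-sensitive systems treat it as a different,
# non-existent path, while case-insensitive systems resolve it
dir.exists(upper_dir)
```

On a case-sensitive file system, a function that receives the case-changed `output_dir` is being asked to write to a different location, so identical results cannot be expected.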

Since the condition to proceed with the review for statistical packages submitted to ROpenSci is that autotest should not find any problems with the code (no diagnostics, no warnings, no errors), I am at a loss on how to proceed. Please advise on what can be done in this case.

Please also explain why autotest only flags this condition in 2 out of the 18 cases where the parameter output_dir is used.

Many thanks for your help, Gilberto

#> ── autotesting sits ──
#> ✔ [1 / 19]: sits_clean
#> ✔ [2 / 19]: sits_cluster_clean

#> ✔ [3 / 19]: sits_cluster_dendro
#> ✔ [4 / 19]: sits_cluster_frequency
#> ✔ [5 / 19]: sits_config_show
#> ✔ [6 / 19]: sits_labels
#> ✔ [7 / 19]: sits_pred_features
#> ✔ [8 / 19]: sits_pred_normalize
#> ✔ [9 / 19]: sits_pred_references
#> ✔ [10 / 19]: sits_pred_sample
#> ✔ [11 / 19]: sits_predictors
#> ✔ [12 / 19]: sits_reclassify
#> ✔ [13 / 19]: sits_sample
#> ✔ [14 / 19]: sits_select
#> ✔ [15 / 19]: sits_select
#> ✔ [16 / 19]: sits_stats
#> ✔ [17 / 19]: sits_timeline
#> ✔ [18 / 19]: sits_to_csv
#> ✔ [19 / 19]: sits_validate

maelle commented 11 months ago

@gilbertocamara regarding the output you mentioned in your comment, can you confirm it's gone? I don't see that exact error (I'm going through your comments chronologically).

gilbertocamara commented 11 months ago

Dear @maelle, above you have shown the autotest output without running the actual tests. My comments above refer to the output with the parameter test set to TRUE. This is the result that counts.

maelle commented 11 months ago

I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest

Good question. To me the best answer is currently, does it help? I opened an issue in autotest because I agree the documentation could be improved on this front

maelle commented 11 months ago

currently actually running the tests :sweat_smile: sorry about that

maelle commented 11 months ago

Regarding the flagging of 2/18 functions, obviously I'll have a better idea once I have the results locally, but since autotest works by scraping examples, this might be due to different examples in these 2 functions?

maelle commented 11 months ago

I updated the results I get. Are they the same as on your machine @gilbertocamara?

gilbertocamara commented 11 months ago

Unfortunately, no. Please wait a little bit. Yesterday, we made some changes to sits trying to match the expectations of autotest. We are currently running the latest test. Please give me until the end of the morning BRT to provide you with an update.

maelle commented 11 months ago

Ok. My answer might have a few days delay depending on my availability but I'll do my best. I'll re-read autotest docs. :grin:

mpadge commented 11 months ago

@gilbertocamara Please accept my apologies as lead developer of autotest for issues here. I am currently away on holidays until start of August. Thank you for taking the autotest procedures and results so seriously, but please note that autotest is currently recommended and not required infrastructure. As such, its output should currently be considered (nothing more than) a useful guide to increase general robustness and documentation of packages prior to submission. It is not necessary for packages to completely pass autotest in order for submissions to proceed.

In short: Please use autotest to help improve your package as much as possible. Once you are satisfied, feel free to ignore any remaining autotest issues and proceed on to documentation of statistical standards compliance.

That said, please also feel free to open any issues in the autotest repository, or to ask any further questions there. The package will undergo a major revision hopefully sometime later this year, which will include numerous improvements in functionality, documentation, and general useability. Again, thank you for engaging so sincerely with these results, and apologies for any confusion during the process.

gilbertocamara commented 11 months ago

Dear @mpadge Many thanks for your response, even during your holidays. We will follow your recommendations and proceed to SRR once we consider we have followed all the relevant recommendations of autotest.

mpadge commented 4 months ago

Any updates @gilbertocamara? As said, don't worry too much (or indeed at all) about autotest, but we would like to proceed with your submission :+1:

gilbertocamara commented 2 months ago

Dear @mpadge Apologies for the long delay in responding. First of all, kudos to ROpenSci for your work! In the last two months, we have been working on release 1.5.0 of the sits package which is due April 30th. In this release, we have tried to incorporate in sits the guidelines for Statistical Software proposed by ROpenSci in connection with the srr package. In particular, we took the guidelines associated with machine learning, spatial and time series packages.

A key point here is that arguably sits is currently unique in the R package landscape. It provides an end-to-end environment for ML/DL analysis of big Earth observation data. We are not aware of any similar package. Thus, many recommendations that would apply for packages that implement improved versions of existing ML algorithms do not apply. A second point is that there is a full book on sits available in, which allows users to perform extended tests and experiments with medium-sized datasets that cannot be loaded in CRAN. We also provide large data sets in github to serve as basis for user experiments.

The sits package is supported by a set of packages available in the github repository, which include sitsdata (medium-sized data sets used in the book) and rondonia20LMR, a data set of 28 GB to test ML methods for image time series in a big data context.

Overall, we found the guidelines to be very useful. To avoid lengthy issues, I will post my thoughts on the SRR guidelines in a set of comments below

gilbertocamara commented 2 months ago

Dear @mpadge Some comments on the SRR Generic Guidelines.

The generic guidelines make very good points, especially regarding documentation and error messages associated with variable checking. They encouraged us to:

(a) improve documentation of internal functions (G1.4a);
(b) provide a statement (G1.2);
(c) include assertions on all input parameters (G2.0);
(d) handle missing data and provide explicit imputation function (G2.14);
(e) include specific messages for each different error, including indicating parameter names (G5.2);
(f) include tests that support a 94% code coverage (G5.4);
(g) include edge-condition tests and associated messages (G5.8).

We could not fully understand the scope of G5.6 (parameter recovery) and G5.9 (noise susceptibility tests) so we consider that they do not apply to sits.

We missed explicit support in the SRR Guidelines regarding the tidyverse. While we understand that there is resistance to tidyverse in certain quarters, in software engineering terms the tidyverse is much better than traditional R *apply methods for data handling. We could not have developed a reliable and efficient package without the tidyverse.

gilbertocamara commented 2 months ago

Dear @mpadge comments on SRR Guidelines on Machine Learning

From the perspective of the sits package and Earth observation in general, the first part of the ML guidelines (ML1.0 to ML1.5) seems to focus excessively on the differentiation between training and test data. In the case of EO data, R packages have to deal with CRAN limitations on example data sets. We tried to overcome this limitation by providing a specific chapter in the on-line book and additional packages which are available in github, as explained above.

Guidelines ML1.6 to ML1.8 deal with missing values. They were useful for us as reminders, taking into account that missing values in EO data arise in a different context than in tabular data.

We consider guideline ML2.0 quite important. In sits, we actually developed a single interface which encapsulates different models using closures. In fact, the guideline ML2.0 is better elaborated in ML4.0 and later in ML5.0. Could one consider merging them?
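
To make the idea of a closure-based interface concrete, here is a minimal base-R sketch. It is not the sits implementation: `method_mean()` and `train_model()` are hypothetical names, and the nearest-mean classifier stands in for any real model.

```r
# Hypothetical sketch of a closure-based model interface (not sits code).
# A "method" is a function that trains on data and returns a predictor
# function, which keeps the fitted state in its enclosing environment.
method_mean <- function() {
    function(train_x, train_y) {
        # fitted state: mean value of each class
        centers <- vapply(split(train_x, train_y), mean, numeric(1))
        function(new_x) {
            # assign each value to the class with the nearest mean
            idx <- vapply(new_x,
                          function(v) which.min(abs(centers - v)),
                          integer(1))
            names(centers)[idx]
        }
    }
}

# Single entry point: accepts any method that follows the same contract.
train_model <- function(x, y, method = method_mean()) {
    method(x, y)
}

x <- c(0.1, 0.2, 0.8, 0.9)
y <- c("Pasture", "Pasture", "Forest", "Forest")
model <- train_model(x, y)
model(c(0.15, 0.85))
```

Because the caller only ever sees `train_model()` and the returned predictor, a different algorithm can be swapped in by passing another method function with the same shape.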

We also agree with ML2.2, although we have not yet implemented it. That said, we did not understand the differences among guidelines ML2.2, ML2.3, ML2.4, ML2.5, and ML2.6. Perhaps they could be grouped together for brevity's sake. We did not understand the context of ML3.0 and its subpoints. We fail to see in which context such a separation between specification and training might be useful. This is probably due to the specific nature of EO data analysis.

As for item ML3.4 and subitems, we consider that requiring developers to provide functions for tuning hyperparameters might be a better approach, especially with deep learning.

Although we understand the rationale for ML6.0, we advise against using training and test data for model assessment in the case of EO data. The community has developed a specific set of best practices for quality assessment. See Olofsson et al. (2014) <doi:10.1016/j.rse.2014.02.015>.

In short, the specific case of applying ML/DL for EO data has issues which are impossible to cover in the scope of generic guidelines as provided in SRR for ML.

gilbertocamara commented 2 months ago

Dear @mpadge Comments on SRR Guidelines on Spatial and Time Series.

The SRR Spatial Guidelines are very good and generally applicable. The emphasis on the sf package (SP2.1) is welcome. However, we missed guidelines on handling raster data. In our work, we found terra to be better and easier to use than stars. In any case, SRR should discourage the use of raster, as these packages have superseded it. We also missed guidelines regarding visualisation of vector and raster data. For your reference, we found tmap, leaflet and leafem to be excellent packages. Note that both tmap and leafem require raster data to be handled by stars. In sits, we use stars for plotting and visualisation, and terra for access to raster values.

We suggest the inclusion of a guideline regarding the installation of GDAL and PROJ, following the instructions associated with the sf package. See more at We also suggest that you consider mentioning the desirability of combining sf with the tidyverse. As acknowledged by Edzer Pebesma, the design of sf has been influenced by the tidyverse to the extent that some tidyverse functions can be applied to the output of sf functions. Thus, arguably, sf users will find it easy to combine the two.

As for time series, we fully agree with guidelines TS1.0 to TS2.1c and have implemented them in sits. As for guidelines TS2.2 to TS2.4b, we considered they do not apply, since in general time series derived from satellite data are not stationary. Also, since satellite image series analysis is about classification and prediction rather than forecasting, we considered that guidelines TS3.0 to TS4.7c do not apply to sits.

gilbertocamara commented 2 months ago

Dear @mpadge Final considerations and a plea for support.

Overall, the SRR Guidelines deserve high praise. The ROpenSci team has provided an excellent service to the community by working hard to develop them. While the guidelines are aimed at small, focused R packages, they are also relevant to larger packages such as sits.

We thus would like your advice on how to proceed. We still consider that a software review of sits would be of much value to us. However, we recognize that sits may fall outside of the scope of the ROpenSci review process. If you are willing to go ahead and review the package, we would be most appreciative. Should you consider that such a review would be cumbersome to the ROpenSci community, we will fully understand your position.

Whatever the case, warm congratulations and thanks to the ROpenSci community!

mpadge commented 2 months ago

@gilbertocamara Thank you so much for your considered and very deep engagement with our statistical standards. The first thing I would like to ask would be for you to copy the above comments into separate issues within the repository for our Statistical Standards Book - one for the Machine Learning and one for Spatial standards. We'll then incorporate your excellent feedback there via updates to our standards.

We are definitely keen to progress with peer-review here. The sits package is definitely within scope. I imagine the only problems sits may pose will be extra burden on reviewers of such a very large and comprehensive package. But we are definitely excited to learn from guiding this kind of package through our system, and from hopefully mutually beneficial feedback from both sides throughout the process.

Our current Editor-in-Chief @jooolia will take it from here. Thank you for all of your work!

gilbertocamara commented 2 months ago

Dear @mpadge Many thanks for your response. I will include the relevant part of my comments as issues in the github repository for the book. I shall be waiting for instructions from @jooolia on how to proceed.

jooolia commented 1 month ago

@ropensci-review-bot check srr

jooolia commented 1 month ago

@ropensci-review-bot check package

Editor check started


Checks for sits (v1.4.2-3)

git hash: 06ab1b32

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL-2

1. Package Dependencies

**NOTE:** Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.

2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:
- code in C++ (6% in 15 files) and R (94% in 136 files)
- 8 authors
- no vignette
- 4 internal data files
- 20 imported packages
- 236 exported functions (median 11 lines of code)
- 2360 non-exported functions in R (median 7 lines of code)
- 92 R functions (median 11 lines of code)

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package

3. goodpractice and other checks

Details of goodpractice checks (click to open)

#### 3a. Continuous Integration Badges
R-CMD-check.yaml

**GitHub Workflow Results**

| id|name |conclusion |sha | run_number|date |
|----------:|:-----------|:----------|:------|----------:|:----------|
| 8933087450|R-CMD-check |success |1ccc1c | 358|2024-05-03 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with rcmdcheck

R CMD check generated the following note:

1. checking installed package size ... NOTE
   installed size is 18.4Mb
   sub-directories of 1Mb or more:
     extdata   1.7Mb
     libs     14.8Mb
     R         1.1Mb

R CMD check generated the following check_fail:

1. rcmdcheck_reasonable_installed_size

Editor-in-Chief Instructions:

Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.

gilbertocamara commented 1 month ago

Dear @jooolia

Many thanks for starting the review process of R package sits.

Regarding the items mentioned:

✖️ Package name is not available (on CRAN): version 1.5.0 will be submitted to CRAN tomorrow. I will inform you when it is accepted.

✖️ Package has no HTML vignettes: The package has a full on-line book (see and so there is no need for HTML vignettes.

✖️ These functions do not have examples: [sits_run_examples, sits_run_tests]. These functions are auxiliary functions, to avoid CRAN checks.

✖️ Package coverage is 0.1% (should be at least 75%). In fact, package coverage is 94% (see SITS includes functions that access cloud services and functions which take a long time to run. We access seven cloud services and some of them may be off-line. For this reason, tests and examples are run off-line. To run tests and examples, please include the following environmental variables:

Sys.setenv(SITS_RUN_TESTS = "YES")

The issues raised by lintr and cyclocomp have been addressed in version 1.5.0

gilbertocamara commented 1 month ago

Dear @jooolia, version 1.5.0 of sits is now on CRAN. Whenever you feel it appropriate, you can start the software review of the package. Please note that all issues raised by the automated bot have been addressed above.

jooolia commented 1 month ago

Thank you @gilbertocamara for the submission and for your explanations to the :heavy_multiplication_x:'s. I am currently looking for a handling editor for this package. Many thanks, Julia

jooolia commented 1 month ago

@ropensci-review-bot assign @mpadge as editor

