Address comments for Bioconductor sumbission

These are the comments from the Bioconductor reviewer that we need to address (from Bioconductor/Contributions#1504)

build report

[x] strongly recommend unit test to check for correct functionality of the package.

gitignore

[x] Please also git ignore any extra directories or files for other applications. Like the dev directory, _pkgdown.yml file, pkgdown directory
[x] Make sure doing git clone does not include any of those files/directories

README

[x] Are both the Rmd and md files necessary?
[x] Make sure to update to pull from the Bioconductor repository for the official/stable version than your github (for development) It can include both if you like.

DESCRIPTION

[x] Please remove LazyData: true we have rarely found this useful and can slow installation.

NAMESPACE

[x] Since your classes extend SummarizedExperiment, I highly recommend importing the full package rather than importFrom. The full range of functinoality provided by SummarizedExperiments should be automatically available to your objects but not without importing the package.

inst

[x] Please have an inst\scripts that describes what is in and how you created any data in inst/extdata This should include any source information, filtering, etc. It doesn't have to be strictly code, it can be text or sudo-code
[x] Citation file should be able to be read independent of package installation. The current CITATION will not display properly on Biocoductor. Again readCitationFile(<pathtofile>) should run without error wihtout the package being loaded.

vignette

[x] I suggest renaming the vignette from umi4cats-usage to UMI4Cats or umi4cats as it is much more intuitive for a user to just call the package name to get the vignette (ie. vignette('UMI4Cats')
[x] Why not run with the sample data you provide with downloadUMI4CexampleData() I haven't looked at the R code yet for this function but by default it should use a temp directory not the users working directory to download data. I'm wondering if you implemented any caching mechanism so users would not have to download multiple times? I would suggest looking at BiocFileCache.
[x] All defined output directorys should write to a tempdir for demo purposes (i.e any use of out_path in the vignette)

R/man

[x] downloadUMI4CexampleData should cache the tar files to save time downloading in the future.

Originally posted by @lshep in https://github.com/Bioconductor/Contributions/issues/1504#issuecomment-642861422

build report

[x] strongly recommend unit test to check for correct functionality of the package.

Thanks for this recommendation. We have currently added unit tests using the testthat package for many package functions. We are planning to continue implementing them for the most complex functions to reach a high coverage.

gitignore

[X] Please also git ignore any extra directories or files for other applications. Like the dev directory, _pkgdown.yml file, pkgdown directory.

[X] Make sure doing git clone does not include any of those files/directories

We removed all non-standard directories and files from the git repository and made sure they are no longer included when doing git clone.

README

[X] Are both the Rmd and md files necessary?

We removed the Rmd README and kept the md file.

[X] Make sure to update to pull from the Bioconductor repository for the official/stable version than your github (for development) It can include both if you like.

We updated the install instructions to pull from Bioconductor, and kept our github as source for the development version.

DESCRIPTION

[X] Please remove LazyData: true we have rarely found this useful and can slow installation.

This has been removed from the DESCRIPTION file.

NAMESPACE

[X] Since your classes extend SummarizedExperiment, I highly recommend importing the full package rather than importFrom. The full range of functinoality provided by SummarizedExperiments should be automatically available to your objects but not without importing the package.

We removed the importFrom statements regarding the SummarizedExperiment package, and now the full package is imported. Additionally, SummarizedExperiment is also listed as Depends, to make all the functionality directly available.

inst

[X] Please have an inst\scripts that describes what is in and how you created any data in inst/extdata This should include any source information, filtering, etc. It doesn't have to be strictly code, it can be text or sudo-code.

We have included two script files in the above-mentioned directory:

inst/scripts/CIITA_generate_reduced_fastq.R contains the code to generate the reduced fastq files used as examples (which are donwloaded with the function downloadUMI4Cexamples()).
inst/scripts/CIITA_process_example.R contains the code to process the above-mentioned fastq files and obtain the files located at inst/extdata/CIITA/.

[X] Citation file should be able to be read independent of package installation. The current CITATION will not display properly on Biocoductor. Again readCitationFile(<pathtofile>) should run without error wihtout the package being loaded.

We have updated the CITATION and made sure that readCitationFile(<pathtofile>) runs without errors without the package being loaded.

vignette

[X] I suggest renaming the vignette from umi4cats-usage to UMI4Cats or umi4cats as it is much more intuitive for a user to just call the package name to get the vignette (ie. vignette('UMI4Cats'))

We renamed the vignette to UMI4Cats.

[X] Why not run with the sample data you provide with downloadUMI4CexampleData() I haven't looked at the R code yet for this function but by default it should use a temp directory not the users working directory to download data. I'm wondering if you implemented any caching mechanism so users would not have to download multiple times? I would suggest looking at BiocFileCache.

downloadUMI4CexampleData() has been modified to take advantage of BiocFileCache and also to use temporary directories by default.

Regarding the data used to run the vignette, we've added a paragraph describing the different types of data that can be downloaded using the downloadUMI4CexampleData() function, that is: 1) A full dataset including fastq files with 200K reads and 2) a reduced dataset with small fastq files containing 100 reads. Thus, the user can download and test the package with the full dataset, while the reduced one can be used to run the "Processing" part of the vignette, as the different functions included in this part are more time consuming when using large fastq files. Then, in the analysis and visualization part, the user can either use its own processed files with the full dataset or the processed files we provide as part of the package in /inst/extdata/CIITA.

The code used for generating both fastq files and the processed files is also included in inst/scripts.

[X] All defined output directories should write to a tempdir for demo purposes (i.e any use of out_path in the vignette

All the output directories used in the vignette now point to temporary directories using tempdir().

R/man

[X] downloadUMI4CexampleData should cache the tar files to save time downloading in the future.

As mentioned above, downloadUMI4CexampleData now takes advantage of the BiocFileCache package and thus caches the content to save downloading times.

Pasquali-lab / UMI4Cats