Bioconductor / Contributions

Contribute Packages to Bioconductor
131 stars 33 forks source link

scTensor #884

Closed kokitsuyuzaki closed 5 years ago

kokitsuyuzaki commented 5 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 5 years ago

Hi @kokitsuyuzaki

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: scTensor
Type: Package
Title: Detection and visualization of cell-cell interaction within single-cell RNA-Seq data
Version: 0.99.0
Date: 2018-09-25
Author: Koki Tsuyuzaki
Maintainer: Koki Tsuyuzaki <k.t.the-answer@hotmail.co.jp>
Depends: R (>= 3.5.0)
Imports: methods, igraph, S4Vectors, plotly, AnnotationDbi, reactome.db, SummarizedExperiment, SingleCellExperiment, nnTensor, rTensor, LRBaseDbi, abind, plotrix, heatmaply, tagcloud, RColorBrewer, rmarkdown, BiocStyle, knitr, biomaRt, LRBase.Hsa.eg.db, MeSH.Hsa.eg.db, MeSHDbi, meshr, GOstats, ReactomePA, grDevices, graphics, stats, utils, outliers, snow, Category
Suggests: testthat, RSQLite, LRBase.Mmu.eg.db, MeSH.Mmu.eg.db
Description: The algorithm is based on non-negative tucker decomposition (NTD) of nnTensor.
License: Artistic-2.0
biocViews: DimensionReduction, SingleCell
VignetteBuilder: knitr

Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.

bioc-issue-bot commented 5 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

4ef0974 v-0.99.2

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

40e2531 v-0.99.3

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

4b8230a v-0.99.4

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

e0c1a9f v-0.99.5

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

kokitsuyuzaki commented 5 years ago

Hi, @LiNk-NY

I've cleared the Single Package Builder. If possible, I'd like to finish this review process until October 5, to publish this package in next Bioconductor 3.8. https://www.bioconductor.org/developers/release-schedule/

Regards,

Koki

LiNk-NY commented 5 years ago

Hi Koki, @kokitsuyuzaki There are a few packages ahead of yours that I am still reviewing. I should be able to get to your package by tomorrow. Best regards, Marcel

kokitsuyuzaki commented 5 years ago

Hi, @LiNk-NY

How about the progress of review of scTensor? If there's anything I can help, please let me know.

Koki

LiNk-NY commented 5 years ago

Hi Koki, Thank you for submitting to Bioconductor. Please see your review below. If you have any questions, feel free to post them in the issue here.

Best regards, Marcel


scTensor #884

DESCRIPTION

NAMESPACE

vignettes/

Minor:

R/

kokitsuyuzaki commented 5 years ago

Thanks for the review! I will correct the places that you pointed out.

LiNk-NY commented 5 years ago

Hi Koki つゆざきさん、 This is just a friendly reminder to respond to the review.

Best regards, Marcel

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

311db28 v-0.99.6

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

kokitsuyuzaki commented 5 years ago

Hi, @LiNk-NY

I checked your comments and modified the corresponding parts as long as possible.

DESCRIPTION

Consider combining Author and Maintainer fields into Authors@R

I merged Author and Maintainer as Authors@R.

Mind the 80 column width formatting guidelines

I couldn't shrink the rows such as Imports:... and Description:... as 80 column, because it caused an error, where error message says "DESCRIPTION file is not invalid DCF format". I shrunk the title much shorter.

NAMESPACE

Remove optional db packages from import directives. Only include these in the Suggests field of the DESCRIPTION file.

I moved LRBase.Hsa.eg.db and MeSH.Hsa.eg.db in Import to Suggests fields.

It appears that you don't need to use cell* prefix for all of your exported functions. Please remove the prefixes.

The character "cellCell" has scientific meaning, that is, if cellCellDecomp is changed as CellDecomp by removing "cell", it can be confused with matrix decomposition method, which is performed for visualization of "cellular" distribution such as PCA, NMF, t-SNE, and UMAP.

My package is based on the tensor decomposition method to extract "cell-cell" interaction and much complicated relationship than just cellular distribution.

Such intension may not be conveyed from CellDecomp.

I think the naming rule of cellCell~ is not so bad. All names of the functions are unique and hardly conflict with the functions of other packages, not as long as SummarizedExperiment() or suppressPackageStartupMessages(), and all of them are camel case.

For the above reasons, I do not want to remove the prefixes.

vignettes/

Please ensure that your vignette and code are within 80 character-width. This improves readability and maintenance.

In some parts of the manuscript such as citation (^[...]), URL, or Table of markdown, the 80-character rule destroyed the format of HTML vignettes.

I shrunk other parts within 80 character-width, as long as possible.

Please remove any commented code and include only running code.

I removed all commented out so that all chunks are evaluated.

Minor: You can avoid showing the suppressPackageStartupMessages by loading the package in a separate chunk and doing echo = FALSE in the chunk options. Then you can use another chunk to show library("LRBaseDbi") with eval = FALSE to get a clean looking code chunk.

I removed suppressPackageStartupMessages() and divided as multiple chunks.

R/

Remove all library calls from AllGenerics.R. Include the extra packages in the Suggests field. Check first if they are available in the user's library with requireNamespace otherwise error out and tell the user to install those packages.

The corresponding parts;

invisible(clusterEvalQ(cl, {
    library("outliers")
    library("S4Vectors")
    library("tagcloud")
    library("plotrix")
    library("plotly")
    library("rmarkdown")
}))

are evaluated in each node generated by snow package, but even if users specify the non-parallel-mode (cl=NULL, default), such packages are required by the cellCellReport function.

That's why these packages cannot be moved to Suggests fields.

I replaced all library() as requireNamespace().

For the series of algorithms, you can create an internal map or use switch to relate keywords to algorithm functions and use that structure to reduce the amount of else if statements that you have.

I'm not sure whether I precisely understand the internal map, but I imagined the code as follows:

flist <- list(
    "A"=function(){print("A")},
    "B"=function(){print("B")}
)
f <- flist["A"]
f[[1]]()
f <- flist["B"]
f[[1]]()
f <- flist[["C"]]
is.null(f[[1]])

For avoiding "else if" statements, I introduced such coding style.

Default out.dir is usually "." or the current working directory

If the out.dir is set as ".", the output directory containing multiple figure files, .Rmd files and .html files are generated in the current working directory, and even if the R process is terminated, the output directory will remain until it is removed.

Thus, I'm concerned about that such generated directory contaminates the working directory, especially you claim that the commented code about cellCellReport should be evaluated.

In the current vignette of scTensor, the output directory is specified as a temporary directory and immediately removed when the R process is terminated, and if the out.dir is not specified by the users, it will be stop, and I think it is clean and safe.

Would you let me hear about your idea?

Why not use ... instead of useobjects and avoid doing eval(parse(...))?

I used such three-dots style.

Avoid using sapply, use the more robust vapply or even lapply

I converted some sapply to vapply. I know the merit of vapply, but vapply is not completely compatible with sapply, especially vapply cannot use simplify=FALSE, and the result has to be same length vectors. That's why I left some sapply in some parts.

Avoid using cat outside of print methods for classes, use message instead

I changed cat() as message().

Use writeLines or capture.output to create the Rmd file instead of sink()

I changed sink() & cat() style as file() and writeLines() style.

Avoid using nested for loops where possible

I removed most of for loop, as long as possible.

You might want to use a lookup table for .ensembl instead of a series of else if statements.

For modified the "else if" statements as "flist" pattern described above.

Consider setting argument defaults with the vector of all the options and do match.arg and take the first by default (see algorithm argument).

I introduced such coding style (all option => match.arg => first as default).

The documentation for cellCellRanks says that mergeas is sum by default but this is not reflected in the usage.

Sorry, the arguments part of the document is wrong. I changed (Default: "sum") to (Default: "mean").

Koki

LiNk-NY commented 5 years ago

Hi Koki, @kokitsuyuzaki

I'm not sure whether I precisely understand the internal map, but I imagined the code as follows:

Yes, you have the right idea with the .flist object. It can be further reduced to just the differences between the algorithm calls such as get(".cellCellDecomp.", algorithm)(...) and handling special cases like pcomb, nmf, and ntd.

Thus, I'm concerned about that such generated directory contaminates the working directory, especially you claim that the commented code about cellCellReport should be evaluated.

Usually, a function shouldn't save files to directories unless specified by the user. Perhaps you can just let the function show the report and allow users to save them later by specifying the dir.out path.

vapply cannot use simplify=FALSE

This is exactly the functionality that you want to avoid in sapply. Your outputs should be predictable and that is why vapply is recommended because you can always expect a vector.

Regards, Marcel

kokitsuyuzaki commented 5 years ago

Yes, you have the right idea with the .flist object. It can be further reduced to just the differences between the algorithm calls such as get(".cellCellDecomp.", algorithm)(...) and handling special cases like pcomb, nmf, and ntd.

Could you explain the details again? I wondering if you mean the get function not as the standard function (https://stat.ethz.ch/R-manual/R-patched/library/base/html/get.html), but as a kind of original accessor function, because standard get requires the second parameter as R-environment.

e <- new.env()
assign("A", 1, envir=e)
get("A", e)

I think you pointed out that this function(... many arguments ...) part is duplicated in all the algorithms and should be removed, right? https://github.com/rikenbit/scTensor/blob/311db28a58be25d4cdbea693a5332752069eb5fe/R/scTensor-internal.R#L3-L4

Thus, I'm concerned about that such generated directory contaminates the working directory, especially you claim that the commented code about cellCellReport should be evaluated. Usually, a function shouldn't save files to directories unless specified by the user. Perhaps you can just let the function show the report and allow users to save them later by specifying the dir.out path.

Yes, I also think that a function should act like that, but the report is html-based and supposed to be used by the user's web-browser. To show the report, .Rmd files firstly have to be generated and compilied to the .html files. Unless all of the .html files are compiled successfully, the report cannot be shown properly. The report also requires many figures to be embed into the .html files. Thus the cellCellReport firstly needs a place to save these files and I want to define the dir.out as a required parameter.

vapply cannot use simplify=FALSE This is exactly the functionality that you want to avoid in sapply. Your outputs should be predictable and that is why vapply is recommended because you can always expect a vector.

What I want as the result here is a list and not a vector. This is because the result can be one-to-many relationship, the number of each element is not always one, and it is hard to be expressed as a vector. That's why I wanted to use sapply with the option simplify=FALSE so that the result is returned as a list. Is this all right if I use lapply instead?

LiNk-NY commented 5 years ago

Hi Koki, @kokitsuyuzaki

Could you explain the details again? I wondering if you mean the get function not as the standard function (https://stat.ethz.ch/R-manual/R-patched/library/base/html/get.html), but as a kind of original accessor function, because standard get requires the second parameter as R-environment.

I was referring to that get function. It will find the function if it has been defined already and use it with the incoming arguments. Perhaps this strategy is too clever and you can decide to leave the code as-is.

Yes, I also think that a function should act like that, but the report is html-based and supposed to be used by the user's web-browser. To show the report, .Rmd files firstly have to be generated and compilied to the .html files. Unless all of the .html files are compiled successfully, the report cannot be shown properly. The report also requires many figures to be embed into the .html files. Thus the cellCellReport firstly needs a place to save these files and I want to define the dir.out as a required parameter.

If you do need to have a default argument, consider using tempdir() as the default.

What I want as the result here is a list and not a vector. This is because the result can be one-to-many relationship, the number of each element is not always one, and it is hard to be expressed as a vector. That's why I wanted to use sapply with the option simplify=FALSE so that the result is returned as a list. Is this all right if I use lapply instead?

Yes, lapply is usually prefered.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

5ca15d6 v-0.99.7

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

f24b8e8 v-0.99.8

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

f586680 v-0.99.9

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

73e671a v-0.99.10

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

e22630b v-0.99.11 (removed parallel functions by snow)

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

8791269 v-0.99.12

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

26acdf5 v-0.99.13

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

kokitsuyuzaki commented 5 years ago

I think the 5 minutes-rule of BiocCheck is too strict and stressful for the developers. Much longer time should be considered. Especially, the windows machine (tokay1) is always slower than the other two server machines and it is always a hard task to finish the check in the machine.

Actually, it takes about twice as much time.

http://bioconductor.org/spb_reports/scTensor_buildreport_20181106055607.html

Why is this machine always so slow?

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

2f041b7 v-0.99.15

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

kokitsuyuzaki commented 5 years ago

Hi, @LiNk-NY

I was referring to that get function. It will find the function if it has been defined already and use it with the incoming arguments. Perhaps this strategy is too clever and you can decide to leave the code as-is.

It is still difficult to me, but anyway I wrote the .flist much shorter.

If you do need to have a default argument, consider using tempdir() as the default.

I set the default value of out.dir as tempdir(). Because the users might not know that the output files are saved at the tempdir, I also added the message to inform the directory.

Yes, lapply is usually prefered.

I changed all sapply as lapply or vapply.

Koki

kokitsuyuzaki commented 5 years ago

Hi, @LiNk-NY

Could you register scTensor to BioC 3.9, if there is no point to be modified?

Koki

LiNk-NY commented 5 years ago

Hi Koki, @kokitsuyuzaki Thank you for your patience. Your package has been accepted. Best regards, Marcel

bioc-issue-bot commented 5 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posed to this issue in the next several days.

Thank you for contributing to Bioconductor!

kokitsuyuzaki commented 5 years ago

Thanks, @LiNk-NY !

mtmorgan commented 5 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/kokitsuyuzaki.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("scTensor"). The package 'landing page' will be created at

https://bioconductor.org/packages/scTensor

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.