Closed jenzopr closed 4 years ago
Hello @jenzopr
Three functions are presented to find marker genes, namely ranger_importances.Seurat,
findAllMarkers_ranger and FindAllMarkers (together with its complementary function FindMarkers).
What are the differences between those and when should I use which? Should or can those
functions be used complimentary?
I could specify in the documentation that FindAllMarkers
and FindMarkers
are the high-level functions that most users would be interested in and ranger_importances.Seurat
and findAllMarkers_ranger
are internals that are usually called by these high-level functions. I feel like keeping these functions separate is essential to maintain consistency with the equivalent functions in Seurat
but the addition of details in the documentation and vignettes would be useful.
Having obtained a suggested gating strategy with fit_ctree, how can quality of the fitted tree be
assessed? Which measurement(s) can be used to evaluate the goodness of the tree?
Since the object returned has a predict
method, It could be used to generate a prediction based on a testing/training split on the data. A vignette explaining how to carry that out would be really useful for the end-user.
sctree::FindMarkers
fails only when called before loading the pacakge, which might have to do with how calls its internals. I feel like the most straightforward way to fix it would be to test if WhichCells
is in the search path and throw a more informative error saying that it needs to be loaded. sctree::FindMarkers(
object = Seurat::pbmc_small,
ident.1 = 0,
ident.2 = 1,
test.use = "RangerDE",
verbose = FALSE)
# Error in WhichCells(object = object, idents = ident.1) :
# could not find function "WhichCells"
# Error in inherits(ok, "try-error") : object 'ok' not found
require(sctree)
sctree::FindMarkers(
object = Seurat::pbmc_small,
ident.1 = 0,
ident.2 = 1,
test.use = "RangerDE",
verbose = FALSE)
# importance p_val gene avg_logFC pct.1
# GNLY 0.0530090754 0.00000000 GNLY 5.1338001 0.444
# CD7 0.0961302279 0.00000000 CD7 4.5355550 0.528
# PRF1 0.0436675892 0.00000000 PRF1 4.0809632 0.417
#................
#................
#................
# IL17RA 0.20 1
# SNHG7 0.12 1
# RBP7 0.28 1
Regarding the function throwing an error when MAST is not installed, that is due to how the function handles the missing package in Seurat, nonetheless, it would be easy enough to just add specific error handling for it. Nonetheless, the error propagates well enough to give an informative error message (will add the exception).
Regarding the output of sctree::plot_flowstyle
, I agree with it being a good idea, will definitely implement that. I guess that handling it in a similar way that plot_gates
does it would be good.
FindAllMarkers
, FindMarkers
, ranger_importances.Seurat
, findAllMarkers_ranger
as high or low level functions.WhichCells
is in the search path when calling sctree::FindMarkers
plot_flowstyle
ranger_importances.Seurat and findAllMarkers_ranger are internals that are usually called by these high-level functions
If they are internals, are you sure you want to @export
them?
You can also mark them with @keywords internal
so they don't show up in the API documentation index and are less prone to confuse users.
Update:
@natallah
Nice work! 🎉
Hello @jenzopr and @mschubert
Commit b402fbe9eaaa17c352053f65c953b173c4f0e5f5 further includes changes on this regard with the clarification in the documentation on what functions are intended for end-user and which are internals, indeed it was a good idea to have @keywords internal
added.
@natallah
@jenzopr @natallah @mschubert
Aadded train/test vignette in commit ffa30082ed60cc7dc0d9bdd26197ebaedbaa4bfe
This completes the checklist for this issue, let me know if you believe we can close this issue. Kindest wishes, Sebastian
Very nice! I really like the Evaluating Tree Quality
vignette!
A last small request from my side: Can you update the introduction of the README.md
file and include a bit more from the papers Summary to fulfill the statement of need from the JOSS checklist: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
You basically have this nicely in the paper, but not yet in the Readme and webpage.
Thanks!
@jspaezp @jenzopr
I updated the README.md accordingly! Great suggestion - thanks!
This issue is part of the JOSS review.
Summary
The example usage sections (entitled with Usage and Seurat Interface in the README.md file) contain detailed code examples for getting started with the package and are fine-grained enough to easily follow along. I was able carve out the following basic analysis workflow:
However, the presented order of functions complicate understanding of the intended analysis workflow and needs improvement (see major and minor issues below).
Major issues
ranger_importances.Seurat
,findAllMarkers_ranger
andFindAllMarkers
(together with its complementary functionFindMarkers
). What are the differences between those and when should I use which? Should or can those functions be used complimentary?fit_ctree
, how can quality of the fitted tree be assessed? Which measurement(s) can be used to evaluate the goodness of the tree?Minor issues encountered during testing
RangerDE
is not a valid option fortest.use
insctree::FindMarkers()
. The error message isUnknown test: RangerDE
.sctree::FindMarkers()
fails withError in PackageCheck("MAST", error = FALSE): could not find function "PackageCheck"
, when usingtest.use = "MAST"
and MAST is not installed. Consider usingrequireNamespace("MAST", quietly = T)
to check if MAST is installed (Ref: http://r-pkgs.had.co.nz/namespace.html). If not, print an informative message to the user that MAST is required for this function.sctree::plot_flowstyle
easily becomes non-readable, when there are more than 8 clusters / colors. IMHO the possibility to highlight cells from a cluster of interest would enhance the usage with many clusters.