farrellja / URD

URD - Reconstruction of Branching Developmental Trajectories
GNU General Public License v3.0
117 stars 41 forks source link

aucprTestByFactor #8

Closed hgb1111 closed 5 years ago

hgb1111 commented 6 years ago

I've run through URD with a single cell dataset, and I am now attempting to analyze some of the branch points in the tree. I have successfully used the aucprTestAlongTree to find genes characteristic of a specific lineage, but now want to find out the genes upregulated along each branch at a specific split.

Following the format found at: https://rdrr.io/github/farrellja/URD/f/Analyses/SupplementaryAnalysis/URD-06-GeneExpressionCascades.Rmd

I ran:

routeA <- cellsInCluster(axial.tree, "segment", c(35, 32, 28)) routeB <- cellsInCluster(axial.tree, "segment", c(35,32, 13)) groups <- c( 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, 25, 3, 4, 5, 6, 7, 8, 9) aucprTestByFactor(axial.tree, cells.1 = routeA, cells.2 = routeB, label = "segment", groups = groups)

and received the error:

Error in [<-.data.frame(*tmp*, is.na(genes.data$AUCPR), "AUCPR", value = 0) : replacement has 1 row, data has 0

I tried different ways of identifying cells.1 and cells.2, a couple different group.ids as the label, and a couple different groups, but received the same error each time. I am not extremely familiar with R, and don't know exactly how to fix the issue.

For reference, the "groups" listed above are just the numbers of each of the dataset "stages", which were extracted from the cluster identities of several merged Seurat objects. Every cell in the object should have a number in this stages column in group.ids.

farrellja commented 6 years ago

Hi hgb,

Try making the following changes:

First, you need to provide groups as a list rather than a vector (groups <- list(1, 10, 11, 12, ...)). That would compare each stage one at a time. You can also group them with a list of vectors to group stages (groups <- list(c(1, 10, 11), c(12, 13, 14), ...) would compare stages 1, 10, and 11 against 12, 13, and 14.

Also, you should check the class of the values in "stages" (class(axial.tree@group.ids$stages)) because you may need to provide character values rather than numeric ones: groups <- list("1", "10", "11", "12", ...)

Lastly, since you said that the groups refer to the stages column of group.ids, then you'll want to set label="stages"

Let me know if that helps -- Jeff

hgb1111 commented 6 years ago

Hi Jeff,

Thanks for the help, I'm still encountering the same error though.

I just ran:

routeA <- cellsInCluster(axial.tree, "segment", c(35, 32, 28)) routeB <- cellsInCluster(axial.tree, "segment", c(35,32, 13)) groups <- list( "1", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "2", "20", "21", "22", "23", "24", "25", "3", "4", "5", "6", "7", "8", "9") aucprTestByFactor(axial.tree, cells.1 = routeA, cells.2 = routeB, label = "stage", groups = groups)

I also ran it with:

groups <- list( 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, 25, 3, 4, 5, 6, 7, 8, 9)

Both times I received back the same error:

Error in [<-.data.frame(*tmp*, is.na(genes.data$AUCPR), "AUCPR", value = 0) : replacement has 1 row, data has 0

Thoughts?

hgb1111 commented 6 years ago

Also,

I'm not sure I entirely understand the purpose of the "groups" function?

However, I ran:

routeA <- cellsInCluster(axial.tree, "segment", c(35, 32, 28)) routeB <- cellsInCluster(axial.tree, "segment", c(35,32, 13)) groups <- list(c(35,32,28), c(35,32,13)) aucprTestByFactor(axial.tree, cells.1 = routeA, cells.2 = routeB, label = "segment", groups = groups)

And it ran for a long time (substantially longer than the previous iteration of this command) and it seemed to be working, but then in the end it popped up with this same error again:

Error in [<-.data.frame(*tmp*, is.na(genes.data$AUCPR), "AUCPR", value = 0) : replacement has 1 row, data has 0

farrellja commented 6 years ago

Hi hgb,

Several things below:

"groups" is to divide up the populations according to some other factor. For instance, instead of considering the cells in routeA against the cells in routeB, but instead want to consider cells in routeA against cells in routeB from Stages 1 and 2, and then repeat that comparison for Stages 3 and 4, Stages 5 and 6, and so on you would use this function. "groups" lets you define how to split the population up according to the label that you refer to. groups=list(c("1","2"), c("3","4"), c("5","6")) would accomplish that, while groups=list(c("1","2","3"), c("4","5","6")) would instead compare the stages three at a time.

If you simply want to compare cells from routeA against cells from routeB with no further subsetting, try the function markersAUCPR. It is the underlying function that is called repeatedly by aucprTestByFactor.

I was not able to reproduce this error easily. Can you check for me the number of cells from each group in your two populations: table(axial.tree@group.ids[cellsInCluster(axial.tree, "segment", c(35, 32, 28)), "stage"]) I am curious whether some groups are absent or only contribute 1 cell. If that is the case, then maybe you should leave those labels out.

If the problem persists, can you share either a reproducible example or share your workspace with me so that I can troubleshoot?

Thanks, Jeff