Closed finjen closed 3 years ago
Hi @finjen,
I'm also not entirely sure what the problem could be here. Could you provide the cluster cell counts (table(mydata$leiden_cluster)) and the cell names (head(colnames(mydata)))? Also, you can try to get the expression matrix from the Seurat object with GetAssayData(), transpose it (t()), and then run voxel_map() on that, providing the vector of cluster assignments in the groups argument.
Cheers, Jonas
Hi Jonas, thank you for the quick reply. I tried running it with the transposed matrix, passing the vector of cluster assignments in the groups argument, but the output is the very same error. These are the outputs for leiden_cluster and the colnames:
table(mydata$leiden_cluster):
  0   1   2   3   4   5   6   7   8   9  10  11  12  13
153 153 148 116  82  67  57  45  43  41  37  32  22   9
head(colnames(mydata)):
[1] "AAACGCTTCGGACTGC-1" "AAAGGATTCGGATACT-1" "AACGAAATCATAAGGA-1" "AACTTCTAGATCGCCC-1" "AAGCATCGTAGCCCTG-1"
[6] "AAGCATCTCTACACTT-1"
Thank you.
Ok, and just to make sure, are you using the most recent version of VoxHunt? You can find out by calling sessionInfo(). Also, maybe provide the gene names as well (head(rownames(mydata))).
The version I am using is voxhunt_0.9.2 (I think that is the newest?).
head(rownames(mydata)) gives me the gene names:
[1] "Xkr4" "Gm1992" "Gm37381" "Rp1" "Mrpl15" "Lypla1"
I just ran the code again, without change, and receive now the following error:
Error in voxel_map(mydata = Matrix::Matrix(expr_mat, sparse = T), stage = stage, : argument "object" is missing, with no default
The code is still:
vox_map <- voxel_map(mydata, stage = "E13", group_name = 'leiden_cluster', genes_use = regional_markers)
Passing object = mydata results in an "unused argument (object = mydata)" error.
Ah ok, so VoxHunt expects human gene symbols (all caps, like XKR4), because I wrote it for human organoids. You can find an ortholog mapping here.
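For a quick first test, the symbols can be upper-cased before a proper ortholog mapping is applied. The snippet below is only a minimal sketch under that assumption: upper-casing is not an ortholog mapping, and predicted mouse genes such as Gm1992 may have no human counterpart at all. The variable names are illustrative.

```r
# Mouse gene symbols as they appear in the thread
mouse_genes <- c("Xkr4", "Gm1992", "Gm37381", "Rp1", "Mrpl15", "Lypla1")

# Naive conversion (assumption: upper-casing is good enough for a first test,
# it does NOT replace a real mouse-to-human ortholog table)
human_like <- toupper(mouse_genes)

# Upper-casing can introduce duplicated names, so drop repeats explicitly
human_like_unique <- human_like[!duplicated(human_like)]

print(human_like_unique)
```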
Ah, of course.. I changed the gene names now to human gene symbols. Thank you. However, the error now is still:
Error in voxel_map(mydata = Matrix::Matrix(expr_mat, sparse = T), stage = stage, : argument "object" is missing, with no default
This is because the first argument of voxel_map() is object, so I believe the code voxel_map(object = Matrix::Matrix(expr_mat, sparse = T)) should work.
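The general R behaviour behind this kind of message can be reproduced with a toy function. The function toy_map() below is a hypothetical stand-in, not VoxHunt's actual code; it only assumes a first argument named object plus a `...` that silently swallows misnamed arguments.

```r
# Hypothetical stand-in: first argument is `object`, extra arguments go to `...`
toy_map <- function(object, stage = "E13", ...) {
  dim(object)
}

m <- matrix(1:6, nrow = 2)

# Misnamed argument: `mydata` ends up in `...`, so `object` is never supplied
msg <- tryCatch(
  toy_map(mydata = m),
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing the matrix positionally or as `object =` works
ok_positional <- toy_map(m)
ok_named <- toy_map(object = m)
```

In both working calls the matrix is matched to object, which is why naming the argument object (or not naming it at all) avoids the error in this toy case.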
I tried that, also as a second option with the transposed matrix as such: vox_map <- voxel_map(Matrix::Matrix(mydata_t, sparse=T), stage="E13", groups=new_leiden, genes_use = regional_markers)
But I still keep receiving this error:
Error: Must subset rows with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 49073 but subscript !duplicated(x, fromLast = fromLast, ...) has size 0.
When I run rlang::last_error() the output is this:
<error/vctrs_error_subscript_size>
Must subset rows with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 49073 but subscript !duplicated(x, fromLast = fromLast, ...) has size 0.
Backtrace:
[.tbl_df(...)
It looks like some of the row or column names of the input matrix don't match the expected input. Can you maybe post the row and column names of mydata_t? You can also have a look at the ABA data that is loaded by VoxHunt to check whether your gene names match the ones it expects: DATA_LIST[['E13']]$matrix
Since the Allen Brain Atlas only provides expression measurements for ~2000 genes, it's also possible that the gene set you selected is not measured in the ABA at all. I would therefore recommend doing feature selection through VoxHunt (structure_markers()) or intersecting your feature set with the genes measured at the timepoint you want to look at (colnames(DATA_LIST[['E13']]$matrix)).
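The intersection step can be sketched in base R as follows. The vector aba_genes is a made-up stand-in for colnames(DATA_LIST[['E13']]$matrix); its contents are illustrative and do not claim to reflect which genes the ABA actually measures.

```r
# Stand-in for colnames(DATA_LIST[['E13']]$matrix); illustrative names only
aba_genes <- c("XKR4", "RP1", "MRPL15", "LYPLA1", "TCEA1")

# Candidate correlation features from the query data set
my_genes <- c("XKR4", "GM1992", "GM37381", "RP1", "MRPL15", "LYPLA1")

# Keep only the features that are also measured in the reference
genes_use <- intersect(my_genes, aba_genes)

# Fraction of the candidate set that survives the filter
frac_kept <- length(genes_use) / length(my_genes)

# An empty overlap is a likely source of downstream subsetting errors
if (length(genes_use) == 0) {
  stop("No overlap between the feature set and the reference genes")
}

print(genes_use)
```

The resulting character vector is the kind of input that would then be passed as genes_use.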
head(colnames(mydata_t), 20)
[1] "XKR4" "GM1992" "GM37381" "RP1" "MRPL15" "LYPLA1" "TCEA1"
[8] "RGS20" "GM16041" "ATP6V1H" "OPRK1" "RB1CC1" "4732440D04RIK" "ST18"
[15] "PCMTD1" "GM26901" "SNTG1" "RRS1" "AC009879.3" "2610203C22RIK"
head(rownames(mydata_t))
[1] "AAACGCTTCGGACTGC-1" "AAAGGATTCGGATACT-1" "AACGAAATCATAAGGA-1" "AACTTCTAGATCGCCC-1" "AAGCATCGTAGCCCTG-1"
[6] "AAGCATCTCTACACTT-1"
Also, among my total 20615 genes almost 1400 are also found in the ABA gene list (88%).
And the regional_markers you are using, are they also among the genes detected in the ABA?
My understanding was that structure_markers() selects markers from the ABA dataset. I wouldn't know right now how to apply it to my dataset based on the arguments it takes.. Or am I misunderstanding your comment?
Yes, your understanding is right. I was talking about the genes passed to the voxel_map() function with the genes_use argument. Those are the genes that are used for correlation, and selecting structure-specific genes here results in better contrast in the spatial maps. Also, the genes used for correlation need to be measured in the ABA, so if you select correlation features differently, you need to make sure that is the case. I was just asking because this would be a possible reason for the error.
What does regional_markers look like?
regional_markers looks like this:
regional_markers
gene group avg_exp fc auc pval padj prcex_self prcex_other
Ah ok, so it should be a vector with genes to use for correlation, not a data frame.
Did this solve your issue?
So, I reran this code:
regional_markers <- structure_markers('E18') %>%
  group_by(group) %>%
  top_n(10, auc) %>%
  {unique(.$gene)}
head(regional_markers)
And sometimes, I get the dataframe as an output, sometimes a list of genes. I am not sure why there is this variability, but as I can rerun until I get the vector and then can continue, it is fine for now. Thank you for your help with this.
As far as I know, there should not be any variability here; if you run the full block, it should output a vector of genes. If you happen to get a data frame, the last line probably did not run for some reason. In that case you can obtain the vector of genes by taking the unique entries of the gene column (which is what the last line does).
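As a defensive sketch, the conversion can be made explicit so that a data frame never reaches genes_use. The toy table below uses the gene, group and auc column names shown earlier in the thread, but its values are made up for illustration.

```r
# Toy marker table; column names follow the thread, values are invented
regional_markers <- data.frame(
  gene  = c("XKR4", "RP1", "XKR4", "TCEA1"),
  group = c("A", "A", "B", "B"),
  auc   = c(0.90, 0.80, 0.85, 0.70),
  stringsAsFactors = FALSE
)

# If a data frame came back, reduce it to the unique gene symbols
if (is.data.frame(regional_markers)) {
  regional_markers <- unique(regional_markers$gene)
}

# At this point the object should be a plain character vector
print(is.character(regional_markers))
print(regional_markers)
```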
Hi, I am trying your code to run voxel_map as such:
vox_map <- voxel_map(mydata, stage = 'E13', group_name = 'leiden_cluster', genes_use = regional_markers)
mydata is my Seurat object, in which I added the leiden clustering information into the metadata.
However, I am receiving the following error:
Error: Must subset rows with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 41650 but subscript !duplicated(x, fromLast = fromLast, ...) has size 0.
To be frank, I do not see where this error is coming from. Would it be possible to receive some input on this?
Thanks.