Closed gringer closed 3 years ago
Problem 1:
I had the wrong ID in my JSON configuration file. The cluster
config setting was set to res.1
, but it should have been OL_clusters
.
This changed the output error:
Warning in mapvalues(as.integer(assign_clust[, 2]), from = 1:length(colors), :
NAs introduced by coercion
The following `from` values were not present in `x`: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
Warning: Error in : Column 5 must be named.
80: <Anonymous>
Error : Column 5 must be named.
Hmm... looks like the code is expecting numeric cluster names.
Changing the code from this:
colorVec = mapvalues(as.integer(assign_clust[, 2]), from = 1:length(colors), to = toupper(colors))
to this:
colorVec = mapvalues(assign_clust[, 2], from = unique(assign_clust[, 2]), to = toupper(colors))
Seemed to fix the NA
issue, although I suspect that there's an easier/quicker way via factor / levels
. The Column 5
issue still remains, though:
2020-10-21 04:39:24 INFO::loading data...
Warning: Error in : Column 5 must be named.
80: <Anonymous>
Error : Column 5 must be named.
The code is also expecting the dimensional reduction to have two dimensions... and I did a three-dimensional UMAP reduction.
Changing the code from this:
df_plot = cbind(full_embedding, assign_clust[, 2], colorVec)
to this:
df_plot = cbind(full_embedding[,1:2], assign_clust[, 2], colorVec)
Has fixed up that issue, which gets past the error that started this issue!
The dimension issue: right now only 2D visualization is supported. The best solution is probably to run a 2-dimensional UMAP for the purpose of visualization (otherwise the information along the 3rd dimension will not be shown on the plot).
About the mapvalues
issue, I think your change looks good. Basically, the from
parameter should be the unique values of the original array. Somehow the code worked without problem for Seurat 2 due to the way they set Idents.
I'm not sure what caused the Column 5
error. Hope you find a solution!
The Column 5 error was due to the dimensionality of the reduction. Taking the first two dimensions fixed this error.
@gringer were you able to get the app up and running? Let me know if there are other problems.
Yes, I've got it working, thanks. My other issues were due to not properly reading the instructions about the input / config file format. I discovered that both the "diff_ex_cluster" and "diff_ex_file" parameters were mandatory, and that creating the differential expression file wasn't too taxing. Here's the code I used, just in case it's useful / helpful:
Idents(hc.no11) <- hc.no11[["OL_clusters"]]; # this is the ident I specified in the config file
map_dfr(as.character(unique(Idents(hc.no11))), function(x){
cat(x,"\n");
FindMarkers(hc.no11, ident.1=x) %>%
rownames_to_column("gene") %>%
mutate(cluster=x);
}) -> marker.DE.tbl;
write_csv(marker.DE.tbl, "/mnt/BigBirdStornext/R_data/RStudioProjects/Seurat/JACI/DEtable_10k_no11.csv.gz");
And my config file, for completeness:
{
"data": [
{
"name": "JACI SCT_10k_400epochs_noCl11",
"file": "/app/data/JACI/JACI_10k_400epochs_no11.rds",
"cluster": "OL_clusters",
"embedding": "umap",
"diff_ex_cluster": "T-cell_1",
"diff_ex_file": "/app/data/JACI/DEtable_10k_no11.csv.gz",
"pt_size": 2,
"font_scale": 0.75
}
],
"config": {
"ui_title": "Single Cell Browser",
"title_link_text": "FR Group",
"title_link_url": "http://www.malaghan.org.nz"
}
}
Glad to hear it's working!
I'll think about improving the instructions for the differential expression file (maybe by showing an example table or something).
I'm getting an error when loading my dataset into the shiny cell browser. I've created this issue as a record of my attempts to fix the problem(s). I'm concentrating on the labelling methods here, because that's my current understanding around the cause of this issue.
The dataset was created using the following command:
The resulting file is about 1.2 GB in size. It contains preferred cell labels as a feature with the name "OL_clusters", which is a naming of clusters/Idents created using the following command:
The difference between
hc.all
andhc.no11
is thathc.no11
doesn't include cluster 11, which was found to have odd expression profiles and excess spreading in the UMAP plot:An Excel table was used to match clusters with their preferred labels: