GreenleafLab / chromVAR

chromatin Variability Across Regions (of the genome!)
https://greenleaflab.github.io/chromVAR/

plotDeviationsTsne "subscript contains invalid names" error #24

Open eliduong opened 6 years ago

eliduong commented 6 years ago

Hi Alicia,

I'm trying to read in annotations from a bed file where column 4 contains the group name (EPI, ENDO, IMM, MES) indicating to which cell type the annotation belongs.

When I try to make a plot using:

    my_annotation_files <- "human_genes_list.bed.txt"
    anno_ix <- getAnnotations(my_annotation_files, rowRanges = rowRanges(counts_filtered), column = 4)
    tsne_results <- deviationsTsne(dev, threshold = 0.9, perplexity = 30, what = c("samples", "annotations"))
    tsne_plots <- plotDeviationsTsne(dev, tsne_results, annotation_name = "EPI", shiny = FALSE)

I get this error: Error: subscript contains invalid names

Can you help me figure out why I am getting this error? Do the group names have to be in a certain format?

Thank you! Elizabeth
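One way to rule out malformed group names before calling getAnnotations is to read the BED file directly and inspect column 4. The following is a minimal sketch in base R, not part of chromVAR. The file contents and the column names (chrom, start, end, group) are hypothetical stand-ins for human_genes_list.bed.txt, and nothing in this thread confirms that the group names cause the error.

```r
# Hypothetical four-column BED content: chrom, start, end, group name.
# This stands in for the real annotation file, which is not shown in the thread.
bed_lines <- c(
  "chr1\t1000\t1500\tEPI",
  "chr1\t2000\t2600\tENDO",
  "chr2\t500\t900\tIMM",
  "chr2\t3000\t3400\tMES",
  "chr3\t100\t400\tEPI"
)
bed_path <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_path)

# Read the file as plain tab-separated text; the column names are assumptions.
bed <- read.table(
  bed_path,
  sep = "\t",
  header = FALSE,
  stringsAsFactors = FALSE,
  col.names = c("chrom", "start", "end", "group")
)

# Count how many regions fall into each group.
group_counts <- table(bed$group)
print(group_counts)

# Flag group names that are empty or carry stray whitespace.
suspicious <- bed$group[bed$group != trimws(bed$group) | bed$group == ""]
print(suspicious)
```

If the counts show only the expected groups and nothing is flagged, the names in the file are probably not the problem.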

AliciaSchep commented 6 years ago

@eliduong Thanks for bringing up this issue. Could you share what head(rownames(dev)) and what head(rowData(dev)) return?

eliduong commented 6 years ago

@AliciaSchep Sorry for the delayed response Alicia. Our server coincidentally went down that day until now.

Here is the info that you requested:

    head(rownames(dev))
    [1] "ENDO" "EPI" "IMM" "MES"

    head(rowData(dev))
    DataFrame with 4 rows and 2 columns
      fractionMatches fractionBackgroundOverlap
    1     0.003930393               0.012111959
    2     0.000970097               0.003917526
    3     0.002190219               0.007397260
    4     0.004800480               0.009916667

I appreciate your help!

AliciaSchep commented 6 years ago

Hi @eliduong, thanks for the info. It was helpful, and I think I've figured out the source of the error. I will fix the bug and update here when the fix has been pushed.

eliduong commented 6 years ago

Thanks @AliciaSchep !

AliciaSchep commented 6 years ago

Okay, I think this bug should be fixed now. Can you try re-installing (from GitHub) and see if it works?

eliduong commented 6 years ago

I'm still getting the same error after re-installing chromVAR.

AliciaSchep commented 6 years ago

hmm okay. same exact error message?

AliciaSchep commented 6 years ago

@eliduong Would you be able to share the stack traceback? e.g. from calling traceback() after getting the error?

eliduong commented 6 years ago

@AliciaSchep Sorry again for the delay. Had to work all weekend in the hospital. I did get the same error, but when I tried again with shiny = FALSE, I was able to get individual tsne plots for each annotation by changing annotation_name to a different annotation each time. I still get the same Error: subscript contains invalid names if I set shiny = TRUE and try to use the interactive plot.

eliduong commented 6 years ago

Hey @AliciaSchep,

After I got the tsne_plots to work with a subset of my data (1000 single cells) using shiny = FALSE for deviationsTsne and plotDeviationsTsne, I tried my whole dataset, which is about 7000 cells after filtering out low-quality cells. The variability for my annotations is similar (all >1.0), but now deviationsTsne gives an error saying the threshold is too high.

    tsne_results <- deviationsTsne(dev, threshold = 1, perplexity = 50, shiny = FALSE)
    Error in deviationsTsne(dev, threshold = 1, perplexity = 50, shiny = FALSE) : threshold too high

I've tried lowering the threshold to as low as 0.5 and still get the same error, which doesn't make sense to me. Any thoughts?

Sorry for all the questions and thank you for your help! E

AliciaSchep commented 6 years ago

Hi @eliduong, thanks for clarifying the error. It makes sense now: the bug that was affecting the non-interactive version also affects the interactive one, and I only fixed the non-interactive version. I will need to update the shiny version as well.

Regarding the new error from deviationsTsne with the full data, I'm not sure what the cause might be. One thought is that NA values are causing the issue. When there are too few expected reads for a certain annotation in a cell, chromVAR gives NA as the deviation. While the computeVariability function ignores NA values, deviationsTsne does not, and it will give NA because the tsne function itself won't handle data with NA values. You could try checking for NA values in the deviations and/or pre-filtering out cells with NA values to see whether that is in fact the cause of the problem.
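The NA check suggested above could look roughly like the following. This is a minimal sketch on a toy matrix in base R, not chromVAR's own code. The matrix values are made up, and the idea that the real matrix would come from an accessor such as deviations(dev) is an assumption to verify against the chromVAR documentation.

```r
# Toy stand-in for a deviations matrix: rows are annotations, columns are cells.
# With a real object the matrix would presumably be extracted from dev
# (for example via an accessor like deviations(dev); this is an assumption).
dev_mat <- matrix(
  c(0.5, -1.2,  NA, 0.3,   # cell1: NA for IMM
    1.1,  0.4, 0.2,  NA,   # cell2: NA for MES
   -0.7,  0.9, 0.6, 0.1,   # cell3: complete
    0.2, -0.3, 1.4, 0.8),  # cell4: complete
  nrow = 4,
  dimnames = list(c("ENDO", "EPI", "IMM", "MES"), paste0("cell", 1:4))
)

# Count NA values per annotation (row) and per cell (column).
na_per_annotation <- rowSums(is.na(dev_mat))
na_per_cell <- colSums(is.na(dev_mat))
print(na_per_annotation)
print(na_per_cell)

# Keep only the cells that have no NA deviation for any annotation.
keep_cells <- na_per_cell == 0
dev_mat_clean <- dev_mat[, keep_cells, drop = FALSE]
print(dim(dev_mat_clean))
```

If many cells are dropped this way, that would be consistent with the NA explanation. The same logical vector could then be used to subset the columns of the real object before calling deviationsTsne, assuming the object supports column subsetting as SummarizedExperiment-style objects generally do.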