maximilianh / cellBrowser

main repo: https://github.com/ucscGenomeBrowser/cellBrowser/ - Python pipeline and Javascript scatter plot library for single-cell datasets, http://cellbrowser.rtfd.org
GNU General Public License v3.0

Cannot read single sample name from the matrix #258

Open kevin198930 opened 1 year ago

kevin198930 commented 1 year ago

Hello, I am trying to create a cell browser from a Seurat object. I converted the .Robj Seurat object to an .rds file and ran the following command:

cbImportSeurat -i project_7688.rds -o UCSC_Import --htmlDir=~/public_html/cb

The command starts running, and FindAllMarkers is also run on the Seurat object. The Seurat object was created under Seurat v4.1.0. However, I end up with the following error:

Writing top 100, cluster markers to UCSC_Import/markers.tsv
Writing cellbrowser config to UCSC_Import/cellbrowser.conf
Prepared cellbrowser directory UCSC_Import
Warning message:
In ExportToCellbrowser(sobj, "UCSC_Import", "UCSC_Import", :
  Embedding pca has more than 2 coordinates, taking only the first 2

real 607m2.065s
user 602m28.051s
sys 4m1.376s
INFO:root:Wrote logfile of R run to UCSC_Import/analysisLog.txt
INFO:root:Copying project_7688.rds to UCSC_Import/project_7688.rds
INFO:root:Not writing UCSC_Import/desc.conf, already exists
INFO:root:dataRoot is not set in ~/.cellbrowser.conf or via $CBDATAROOT. Dataset hierarchies are not supported.
INFO:root:Determining if ~/public_html/cb/UCSC_Import/exprMatrix.tsv.gz needs to be created
INFO:root:~/public_html/cb/UCSC_Import/exprMatrix.tsv.gz does not exist. Must build matrix now.
INFO:root:Loading old config from ~/public_html/cb/UCSC_Import/dataset.json
INFO:root:Checking and reordering meta data to ~/public_html/cb/UCSC_Import/meta.tsv
INFO:root:Reading sample names from ~/Project_7688/Results/UCSC_Import/meta.tsv
INFO:root:Reading headers from file ~/Project_7688/Results/UCSC_Import/counts_exprMatrix.tsv.gz
ERROR:root:Could not read a single sample name from the matrix. Internal error

I am not sure why it cannot read a single sample name from the matrix. Initially, I thought it was because the sample names started with a digit (e.g. 11-107). However, I changed the sample names so that they start with a letter (e.g. s11-107), and the error persists. Any help would be greatly appreciated. Thanks!

maximilianh commented 1 year ago

Hi Kevin, thanks for your question.

I've never seen this error message before. Could you share your expression matrix file counts_exprMatrix.tsv.gz with me (maxh@ucsc.edu)? Or the entire folder ~/Project_7688/Results/UCSC_Import/ as a zip file? An alternative way to work around this could be the --useMtx option of cbImportSeurat.

The sample names are read with this function, and I don't understand what could go wrong here. Did you look at counts_exprMatrix.tsv.gz yourself? What did it look like?

def readHeaders(fname):
    " return headers of a file "
    logging.info("Reading headers from file %s" % fname)
    ifh = openFile(fname, "rtU")
    line1 = ifh.readline().rstrip("\r\n")
    sep = sepForFile(fname)
    row = line1.split(sep)
    row = [x.rstrip('"').lstrip('"') for x in row] # Excel sometimes adds quotes
    logging.debug("Found %d fields, e.g. %s" % (len(row), row[:3]))
    if len(row)==0:
        errAbort("Could not read headers from file %s" % fname)
    return row

kevin198930 commented 1 year ago

Hi Maximilian,

Thanks for your response. I opened the expression matrix file counts_exprMatrix.tsv.gz after running gunzip, and the file is empty. Initially, my Seurat object was saved with a .Robj file extension, and I converted it to .rds. I also looked at running the ExportToCellbrowser() function within R instead of the standard command-line cbImportSeurat approach. However, even after loading Seurat and SeuratData within R, I don't see the ExportToCellbrowser() function. Since counts_exprMatrix.tsv is empty, something obviously went wrong while the expression counts matrix was being built.
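For anyone hitting the same error, here is a minimal Python sketch (not part of cellbrowser) of checking the header line of a gzipped matrix without running gunzip first. The helper names `read_header_fields` and `has_sample_names` are hypothetical, and the layout (first column holds gene identifiers, remaining columns hold sample names) is an assumption. The demo also shows that an empty first line splits into one empty field, so a `len(row)==0` test like the one in readHeaders above would presumably not fire on an empty file.

```python
import gzip
import os
import tempfile


def read_header_fields(path, sep="\t"):
    """Return the fields of the first line of a gzipped matrix file."""
    with gzip.open(path, "rt") as fh:
        line1 = fh.readline().rstrip("\r\n")
    # Strip surrounding double quotes, as readHeaders does.
    return [field.strip('"') for field in line1.split(sep)]


def has_sample_names(fields):
    """True if at least one non-empty name follows the first column.

    Assumption: the first column is the gene identifier column.
    """
    return any(name != "" for name in fields[1:])


# Demo on a throwaway empty .gz file, standing in for the empty matrix.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty_exprMatrix.tsv.gz")
    with gzip.open(empty_path, "wt") as fh:
        pass  # write nothing
    empty_fields = read_header_fields(empty_path)

print(empty_fields)                    # one empty string, so len() is 1, not 0
print(has_sample_names(empty_fields))  # no sample names in an empty header
```

Pointing `read_header_fields` at the real counts_exprMatrix.tsv.gz shows whether the header is missing before the import gets as far as the "Could not read a single sample name" error.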

maximilianh commented 1 year ago

cbImportSeurat should understand the .Robj format automatically. Hm, maybe we never documented this...

seurat-data is the wrong package. If you read the docs at https://github.com/satijalab/seurat-data, you can see that seurat-data is about datasets. You need "seurat-wrappers": https://github.com/satijalab/seurat-wrappers

Either way, something went wrong in cbImportSeurat. Do you still have the log file from the cbImportSeurat run? Can you share the object with me (maxh@ucsc.edu)?

cbImportSeurat is pretty easy to debug, because it generates an R script. If you still have it, you can run it and look at the output. Usually, it's pretty obvious if something is going wrong.
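As a rough sketch of a first pass over the log file mentioned above (UCSC_Import/analysisLog.txt in the output quoted earlier), the following Python snippet lists lines that mention an error or a warning. The function name `find_problem_lines` is hypothetical, and matching on the words "error" and "warning" is an assumption about how the R output flags problems, not something cellbrowser guarantees.

```python
import re

# Assumption: problems in the R output are flagged with these words.
PROBLEM = re.compile(r"\b(error|warning)\b", re.IGNORECASE)


def find_problem_lines(lines):
    """Return (line_number, text) pairs for lines mentioning an error or warning."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PROBLEM.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Demo on a few lines in the style of the output quoted earlier in this thread.
sample_log = [
    "Writing cellbrowser config to UCSC_Import/cellbrowser.conf\n",
    "Warning message:\n",
    "Embedding pca has more than 2 coordinates, taking only the first 2\n",
]
for number, text in find_problem_lines(sample_log):
    print(number, text)

# For a real run, pass the open log file instead:
# with open("UCSC_Import/analysisLog.txt") as fh:
#     hits = find_problem_lines(fh)
```

This only narrows down where to look. The surrounding lines of the log usually matter more than the matched line itself.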