constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

Error loading 10x data #45

Closed. anaccsilva closed this issue 4 years ago.

anaccsilva commented 4 years ago

Hello, I am pretty new to R, but I am trying to clean up some data and found SoupX to be the solution. I am following the vignette provided with the package using my own data, and as soon as I run "sc = load10X(dataDirs)" I get the following error:

sc = load10X(dataDirs)
Loading raw count data
Error in full.data[[1]] : subscript out of bounds

Can someone help me figure out the problem, please? Thanks

julicudini commented 4 years ago

I got this error when I gave the path to the raw_feature_bc_matrix directory instead of the top-level directory that contains both raw_feature_bc_matrix and filtered_feature_bc_matrix. The fix was simply to remove raw_feature_bc_matrix from the path. I also had to remove the trailing slash from the end of that path, or it fails later down the line too.
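In case it helps, here is a minimal base R sketch of the two path fixes described above. The function name normaliseTenXDir is made up for illustration and is not part of SoupX; it only assumes the two folder names mentioned in this thread.

```r
# Hypothetical helper (not part of SoupX): tidy a path before passing it to load10X().
normaliseTenXDir <- function(path) {
  # Drop any trailing slashes (forward or back).
  path <- sub("[/\\\\]+$", "", path)
  # If the path points at one of the matrix folders, step up to its parent.
  if (basename(path) %in% c("raw_feature_bc_matrix", "filtered_feature_bc_matrix")) {
    path <- dirname(path)
  }
  path
}

normaliseTenXDir("soup_x_msg11/")                       # "soup_x_msg11"
normaliseTenXDir("soup_x_msg11/raw_feature_bc_matrix/") # "soup_x_msg11"
```

The cleaned path can then be passed on, e.g. sc = load10X(normaliseTenXDir(dataDirs)).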

anaccsilva commented 4 years ago

Hi @julicudini, thanks for your comment. I didn't post the full code earlier, but I did give the path to the top-level directory containing both the raw and filtered folders, as you said.

dataDirs = c("soup_x_msg11/")
sc = load10X(dataDirs)

I am wondering if it is a problem with the files. The raw and filtered folders should each contain 3 files, correct? barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz?
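One quick way to check this from R is sketched below. It assumes the folder and file names listed above (the Cell Ranger v3 style layout); missingTenXFiles is a made-up name, not a SoupX function. Note that it only checks the matrix folders, not the extra metadata that load10X also tries to read.

```r
# Hypothetical helper (not part of SoupX): report which expected files are missing.
missingTenXFiles <- function(topDir) {
  subDirs  <- c("raw_feature_bc_matrix", "filtered_feature_bc_matrix")
  expected <- c("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz")
  # Build every sub-folder/file combination and keep the ones that do not exist.
  paths <- file.path(topDir, rep(subDirs, each = length(expected)), expected)
  paths[!file.exists(paths)]
}

# An empty result (character(0)) means every expected file was found.
missingTenXFiles("soup_x_msg11")
```

File names are case sensitive on most Linux systems, so Barcodes.tsv.gz and barcodes.tsv.gz are not interchangeable there.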

constantAmateur commented 4 years ago

Hmm, that does sound like a problem with the files or folder structure of your data. It could also be related to the metadata that SoupX tries to load automatically when using 10X data.

Can you try loading the data manually without the helper function with:

tod = Seurat::Read10X('soup_x_msg11/raw_feature_bc_matrix')
toc = Seurat::Read10X('soup_x_msg11/filtered_feature_bc_matrix')
sc = SoupChannel(tod,toc)

and see if you still get an error?

anaccsilva commented 4 years ago

Hi, thanks. It worked. Sorry for this next question, but as I said previously, I am pretty new to this. I don't know exactly where in the vignette to pick up again with this object. I tried a few options but none worked.

constantAmateur commented 4 years ago

You'll need to add clustering information with setClusters, which means generating clusters with some other package, such as Seurat. Usually load10X loads this from the cellranger output, but I suspect there is something non-standard in your analysis folder that is causing load10X to fail. Look at the help for setClusters for details on how to do this.

If you want to do any of the visualisation steps in the vignette you also need to provide some kind of dimension reduction using setDR. This is not essential, but be aware that some of the visualisation examples in the vignette won't work without it.

After that you should be able to proceed from the section starting "Visual sanity checks" in the vignette.
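As a rough sketch of the steps above, assuming Seurat and SoupX are both installed and that each folder holds a single gene expression matrix: the wrapper name prepareSoupChannel, the default Seurat parameters and the choice of 30 dimensions are illustrative assumptions, not recommendations from the vignette.

```r
# Hypothetical helper (not part of SoupX or Seurat): build a SoupChannel by hand,
# then attach Seurat clusters and a UMAP so the vignette steps can be followed.
prepareSoupChannel <- function(dataDir, dims = 1:30) {
  # Load raw (all droplets) and filtered (cells only) counts, as suggested earlier.
  tod <- Seurat::Read10X(file.path(dataDir, "raw_feature_bc_matrix"))
  toc <- Seurat::Read10X(file.path(dataDir, "filtered_feature_bc_matrix"))
  sc  <- SoupX::SoupChannel(tod, toc)

  # Cluster the filtered cells with a default Seurat workflow.
  srat <- Seurat::CreateSeuratObject(toc)
  srat <- Seurat::NormalizeData(srat)
  srat <- Seurat::FindVariableFeatures(srat)
  srat <- Seurat::ScaleData(srat)
  srat <- Seurat::RunPCA(srat)
  srat <- Seurat::FindNeighbors(srat, dims = dims)
  srat <- Seurat::FindClusters(srat)
  srat <- Seurat::RunUMAP(srat, dims = dims)

  # setClusters takes cluster IDs named by cell barcode.
  clusters <- setNames(as.character(srat$seurat_clusters), colnames(srat))
  sc <- SoupX::setClusters(sc, clusters)

  # setDR takes a two-column data.frame with cell barcodes as row names.
  umap <- as.data.frame(Seurat::Embeddings(srat, reduction = "umap"))
  sc <- SoupX::setDR(sc, umap)
  sc
}
```

With that defined, something like sc = prepareSoupChannel("soup_x_msg11") should give an object you can carry into the "Visual sanity checks" section. If Read10X returns a list (for example when the run contains more than one feature type), pick the gene expression matrix from it before calling SoupChannel.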

anaccsilva commented 4 years ago

Hello, thanks again for your help. Just an update: since we work with a core facility, they had all the information we needed. They helped me get the whole cellranger outs folder, and it worked just fine. Thanks a lot!