Questions for running split-seq docker

dmsalsgh97 commented 3 years ago

Hi, first, thanks for developing a wonderful pipeline for scRNA-seq!

I'm trying to use the Docker image from Docker Hub. It works pretty well, but the container always exits, even when I use the -itd option. Is there something wrong with what I'm doing?

I want to use the Docker image together with RStudio and Jupyter Notebook to integrate split-seq data with Seurat and scanpy.

yjzhang commented 3 years ago

It might be because I haven't updated the image in 2 years... Anyway I just pushed a new image. Maybe it will work now?

dmsalsgh97 commented 3 years ago

Umm, I tried it, but the -it option doesn't seem to work with your Docker image. When I run the image, the container exits with the following output..

Anyway, your code works well for SRR6750042 with chemistry v1. Thanks for the reply!

yjzhang commented 3 years ago

Okay, so it turns out that you need an --entrypoint argument because the Dockerfile has a predefined entrypoint:

docker run -it --entrypoint /bin/bash ayuezhang27/split-seq-pipeline
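
If the same command needs to be launched from a script, for example to add a volume mount for the fastq directory, the argument list can be assembled programmatically. The following is a minimal Python sketch and not part of the pipeline: the function name `docker_shell_argv` and the mount paths are hypothetical, while `-it`, `--entrypoint`, and `-v` are standard `docker run` options.

```python
import shlex


def docker_shell_argv(image, entrypoint="/bin/bash", mounts=None):
    """Build the argv for `docker run` with the image's entrypoint overridden.

    `mounts` maps host directories to container directories (hypothetical
    paths supplied by the caller); each pair becomes a `-v host:container` flag.
    """
    argv = ["docker", "run", "-it", "--entrypoint", entrypoint]
    for host_dir, container_dir in (mounts or {}).items():
        argv += ["-v", f"{host_dir}:{container_dir}"]
    argv.append(image)  # the image name must come after all options
    return argv


# Same command as above, rebuilt as an argument list.
plain = docker_shell_argv("ayuezhang27/split-seq-pipeline")
print(shlex.join(plain))

# With a hypothetical host directory mounted at /data inside the container.
mounted = docker_shell_argv(
    "ayuezhang27/split-seq-pipeline",
    mounts={"/path/to/fastq": "/data"},
)
print(shlex.join(mounted))

# To actually start the container, pass the list to subprocess.run(plain).
```

Without `--entrypoint`, anything placed after the image name is handed to the predefined entrypoint as arguments rather than run as a command, which is why `-it` alone does not give a shell.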

dmsalsgh97 commented 3 years ago

Oh it works!

thanks for your kindness

worker000000 commented 3 years ago

Thanks a lot. I also want to ask some questions about the docker version.

1. The --chemistry option seems to come from the wet-lab experiment. How should I set it? Does split-seq also have chemistry versions? I have seen this in 10x Genomics.
2. In the "Running the pipeline" section there are 2 fastq files, so why are there 4 sample names? Where do the names come from, and how can I get information such as the wells (A1:B6 and so on)?
3. There is a section called "Merging Sublibraries into a Single Matrix". Does that mean I run each pair of fastq files through the "Running the pipeline" step, and then do the "Merging Sublibraries into a Single Matrix" step?
4. Can the processed data be passed easily to Seurat? If so, which version of Seurat have you tried (there are 4 versions now), and which files should be passed in?

dmsalsgh97 commented 3 years ago

Hi worker,

The chemistry option determines the barcode sequences, etc. of the split-seq protocol, so you need to know which chemistry version the data was produced with.
I think sample_name is for further analysis (such as spatial analysis or multiplexing) using the relationship between BC1 and well position.
You can use the output data with Seurat. When you run split-seq-pipeline in all mode, there are 3 output files (the R script below reads DGE.mtx, genes.csv, and cell_metadata.csv).

Using the R Matrix package, you can put the data into the right format for Seurat. Below is my R script.

library(Matrix)
library(Seurat)

# Read the sparse count matrix, then label columns with gene names and rows with cell barcodes
dat = readMM('your file location for DGE.mtx')
colnames(dat) = read.table('your file location for genes.csv', sep=',', header=TRUE)$gene_name
rownames(dat) = read.table('your file location for cell_metadata.csv', sep=',', header=TRUE)$cell_barcode
dat <- t(dat)  # transpose the DGE matrix so that genes are rows and cells are columns, as Seurat expects

Seurat_test = CreateSeuratObject(dat, names.delim='-', min.cells = 3, min.features = 200)
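
The transpose in the script above is needed because the row names are cell barcodes and the column names are gene names, i.e. the matrix is read as cells x genes, while Seurat expects genes x cells. One way to confirm the orientation before loading anything is to compare the size line of the .mtx file with the number of cells and genes. The following stdlib-only Python sketch does this on a made-up 2 x 3 toy matrix. The function names and the toy data are hypothetical, and it assumes the standard Matrix Market coordinate layout (comment lines starting with %, then a "rows cols nonzeros" line, then 1-based "row col value" triples). In practice a library reader such as scipy.io.mmread would do the parsing.

```python
import io


def read_mtx(handle):
    """Parse a Matrix Market coordinate file into (n_rows, n_cols, entries).

    `entries` maps 0-based (row, col) pairs to values.
    """
    # Skip blank lines and the % header/comment lines.
    lines = (ln for ln in handle if ln.strip() and not ln.startswith("%"))
    n_rows, n_cols, _nnz = (int(x) for x in next(lines).split())
    entries = {}
    for ln in lines:
        i, j, v = ln.split()
        entries[(int(i) - 1, int(j) - 1)] = float(v)  # file indices are 1-based
    return n_rows, n_cols, entries


def transpose(n_rows, n_cols, entries):
    """Swap rows and columns, like t(dat) in the R script."""
    return n_cols, n_rows, {(j, i): v for (i, j), v in entries.items()}


# Toy stand-ins for DGE.mtx, cell_metadata.csv and genes.csv (made-up values).
toy_mtx = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "2 3 3\n"
    "1 1 5\n"
    "1 3 2\n"
    "2 2 7\n"
)
cells = ["cellA", "cellB"]
genes = ["g1", "g2", "g3"]

n_rows, n_cols, entries = read_mtx(toy_mtx)
# Rows match the cells and columns match the genes: the file is cells x genes.
assert (n_rows, n_cols) == (len(cells), len(genes))

# After the transpose, rows are genes and columns are cells.
g_rows, g_cols, by_gene = transpose(n_rows, n_cols, entries)
print(g_rows, g_cols)
```

If the first assertion fails on real data, the labels and the matrix disagree and the transpose step should be revisited before building the Seurat object.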

worker000000 commented 3 years ago

Thanks a lot.

1. Can you help me with question 3?
2. I read the supplementary methods of the split-seq article, and it describes many preparation steps for the fastq data. Is there any tool for doing this?
3. Which version of Seurat are you using, 3 or 4?

dmsalsgh97 commented 3 years ago

Hi worker,

The authors uploaded the computational methods to this GitHub page, and by running the pipeline in all mode you get the gene counts for each cell (an N x k matrix) for further analysis, for example with Seurat.

I used the 3.1.5 version of Seurat.

worker000000 commented 3 years ago

thanks

dmsalsgh97 commented 3 years ago

Well, it seems like your Docker disk mount is causing the problem: '/sc/fastq_test//singlecells... ' And this is the GitHub page, because Alexander B. Rosenberg is one of the contributors...

worker000000 commented 3 years ago

Thanks a lot.

1. After reading some papers, I could not find any that talk about --chemistry v2. Do you know how to set this option? I think we can ignore it.
2. The sample option seems to be important only in split-seq. Do you know the well positions of the samples in the split-seq paper?

yjzhang / split-seq-pipeline

Questions for running split-seq docker #25