Closed houruiyan closed 3 years ago
Hello! Thank you for your question.
It sounds like you have some cellranger-aligned bams, and you have not run SICILIAN on that bam, is that correct?
In that case, I think you would want to have the following options:
SICILIAN = false
samplesheet = YOUR_SAMPLESHEET_HERE.csv
For 10X data, I would follow the instructions in the first block to create the samplesheet: https://github.com/salzmanlab/SpliZ#samplesheets You should have 2 comma-separated columns:
For the metadata, that file should have at least 3 columns:
cell_id: formatted as ${bamID}${cellranger_barcode}
grouping_level_1: the metadata unit over which you would like to perform differential analysis
grouping_level_2: the metadata unit that you would like to calculate differential analysis
It is possible that you only have one group over which you'd like to perform differential analysis (#2), in which case you can leave grouping_level_1 blank, and your metadata would look like:
cell_id: formatted as ${bamID}${cellranger_barcode}
grouping_level_2: the metadata unit that you would like to calculate differential analysis
An example I can provide is if you have data from multiple tissue (i.e. lung, kidney, and heart) and multiple cell_type (i.e. endothelial, blood, capillary) within each tissue.
If grouping_level_1 = tissue and grouping_level_2 = cell_type, then you would be looking for differential SpliZ in endothelial vs blood vs capillary FOR EACH tissue.
If grouping_level_2 = cell_type and there is no grouping_level_1, then you would be looking for differential SpliZ in endothelial vs blood vs capillary, irrespective of tissue.
If grouping_level_2 = tissue and there is no grouping_level_1, then you would be looking for differential SpliZ in lung vs kidney vs heart, irrespective of cell_type.
I hope that helps, and feel free to paste in your config file/metadata/samplesheets to check. And thanks again for your question, I'll update the readme to clarify the parameters a bit.
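To make the metadata layout described above concrete, here is a minimal Python sketch. It is not part of SpliZ. The BAM ID, barcodes, and annotations are invented, and writing the table as tab-separated text is an assumption based on the .tsv file name used elsewhere in this thread. The comparisons helper is hypothetical and only illustrates how the two grouping levels are read here: grouping_level_2 values are compared within each grouping_level_1 value.

```python
import csv
import io
from collections import defaultdict

# Hypothetical example values; real barcodes come from cellranger and real
# annotations from your own analysis.
BAM_ID = "sample1"
CELLS = [
    # (cellranger_barcode, tissue, cell_type)
    ("AAACCTGAGAAACCAT", "lung", "endothelial"),
    ("AAACCTGAGAAACGAG", "lung", "blood"),
    ("AAACCTGAGAAACGCC", "kidney", "endothelial"),
    ("AAACCTGAGAAAGTGG", "kidney", "capillary"),
]


def build_rows(bam_id, cells):
    """Return metadata rows with cell_id = bamID immediately followed by the barcode."""
    return [
        {"cell_id": f"{bam_id}{barcode}", "tissue": tissue, "cell_type": cell_type}
        for barcode, tissue, cell_type in cells
    ]


def to_tsv(rows):
    """Serialize rows as tab-separated text with a header line (assumed format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def comparisons(rows, level_2, level_1=None):
    """Map each level_1 value (or "all" if unset) to the level_2 values compared within it."""
    groups = defaultdict(set)
    for row in rows:
        key = row[level_1] if level_1 else "all"
        groups[key].add(row[level_2])
    return {key: sorted(values) for key, values in groups.items()}


rows = build_rows(BAM_ID, CELLS)
print(to_tsv(rows))
# Cell types compared separately within each tissue.
print(comparisons(rows, level_2="cell_type", level_1="tissue"))
# Cell types compared across all cells, irrespective of tissue.
print(comparisons(rows, level_2="cell_type"))
```

In this sketch the metadata columns are named tissue and cell_type, so the config would need to refer to those column names for the grouping levels.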
Thank you very much! Your explanation is very clear! I wrote the .config file and built the metadata/samplesheet according to your instructions. I think there is also one point worth noting: when we use the bam file, we do not need to set a value for the "input file". I think it works. This is my metadata.
This is my config.
But a new problem has appeared.
I don't know what is causing this problem. I hope you can help. Thank you!
Can you please navigate to the 'Work dir' of that failed job, and paste the results of *.log?
The 'Work dir' path is located at the bottom of your second image, i.e. /storage/yhhuang/../work/..
It may also be helpful to paste in a couple lines of your MS_ann_splices.tsv file.
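For anyone collecting the same information, a small Python sketch like the one below could gather it. It is a generic helper, not part of SpliZ or Nextflow: tail_logs and head are hypothetical names, and the work directory path has to be taken from the 'Work dir' line of your own failed job.

```python
import sys
from collections import deque
from pathlib import Path


def tail_logs(work_dir, n_lines=20):
    """Return {file name: last n_lines lines} for each file in work_dir ending in .log."""
    tails = {}
    log_paths = sorted(
        p for p in Path(work_dir).iterdir() if p.is_file() and p.name.endswith(".log")
    )
    for log_path in log_paths:
        with log_path.open(errors="replace") as handle:
            # deque with maxlen keeps only the final n_lines lines of the file.
            tails[log_path.name] = [
                line.rstrip("\n") for line in deque(handle, maxlen=n_lines)
            ]
    return tails


def head(path, n_lines=3):
    """Return the first n_lines lines of a text file, e.g. MS_ann_splices.tsv."""
    with Path(path).open(errors="replace") as handle:
        return [line.rstrip("\n") for _, line in zip(range(n_lines), handle)]


if __name__ == "__main__":
    # Pass the 'Work dir' path of the failed job as the first argument.
    work_dir = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    if work_dir.is_dir():
        for name, lines in tail_logs(work_dir).items():
            print(f"== {name} ==")
            print("\n".join(lines))
```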
Dear Dr Chaung,
This is my calc_splizvd.log in the "work dir":
This is the MS_ann_splices.tsv file in my "work dir"
Thank you!
Hi, if the column names of your metadata file are grouping_level_1 and grouping_level_2, then your config file should have:
grouping_level_1 = grouping_level_1
grouping_level_2 = grouping_level_2
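The point here is that the config values must match the metadata column names exactly. A quick way to check this before launching a run is sketched below in Python. The helper missing_grouping_columns is hypothetical and not part of SpliZ, the example metadata is invented, and a tab-separated metadata file is assumed.

```python
import csv
import io


def missing_grouping_columns(metadata_text, grouping_levels, delimiter="\t"):
    """Return the required column names that are absent from the metadata header."""
    header = next(csv.reader(io.StringIO(metadata_text), delimiter=delimiter))
    # cell_id is always expected; unset (None or empty) grouping levels are skipped.
    required = ["cell_id"] + [name for name in grouping_levels if name]
    return [name for name in required if name not in header]


# Hypothetical metadata whose columns are named after what they contain.
metadata_text = "cell_id\ttissue\tcell_type\nsample1AAACCTGAGAAACCAT\tlung\tblood\n"

# Config values that do not match the header are reported as missing:
print(missing_grouping_columns(metadata_text, ["grouping_level_1", "grouping_level_2"]))
# → ['grouping_level_1', 'grouping_level_2']

# Config values equal to the actual column names pass the check:
print(missing_grouping_columns(metadata_text, ["tissue", "cell_type"]))
# → []
```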
ok, thank you very much! I will try it! Thank you again!
It works. thank you!
No problem!
Hello, I want to run this tool on non-SICILIAN inputs, but I don't know what code to run. Can you show me yours? Thanks!
If I need to configure the .config file, where should I modify it, and what code should I run? Thanks!
Hello @wlei-amu, what kind of data do you want to run on? 10X cellranger BAMs?
Dear juliaolivieri, I built SpliZ as follows:
git clone https://github.com/salzmanlab/SpliZ.git
cd SpliZ
conda env create --name spliz_env --file=environment.yml
conda activate spliz_env
conda install nextflow
I have run the test data successfully by modifying small.config to set input_file = "small_data/small.pq".
Here, I wonder: if we run SpliZ using 10X cellranger BAMs, which config file should we edit or generate? Can I just modify the nextflow.config file as follows:
// Global default params, used in configs
params {
// Workflow flags for SpliZ
// TODO nf-core: Specify your pipeline's command line flags
dataname = wx
input_file = wx_1.bam
SICILIAN = false
pin_S = 0.01
pin_z = 0.0
bounds = 5
light = false
svd_type = "normdonor"
n_perms = 100
grouping_level_1 = grouping_level_1
grouping_level_2 = grouping_level_2
libraryType = null
run_analysis = false
samplesheet = samplesheet.csv
annotator_pickle = hg38_refseq.pkl
exon_pickle = hg38_refseq_exon_bounds.pkl
splice_pickle = hg38_refseq_splices.pkl
meta = metadata.tsv
gtf = GRCh38_genomic.gtf
rank_quant = 0
outdir = './results/${params.dataname}'
publish_dir_mode = 'copy'
Or should I generate a new config file? If so, how should I load the new config file?
Thanks a lot.
Hi, thanks for the great tool. I am trying to use it to solve some problems in my project. I have 10x data, and I used cellranger to align it to the human reference, which gave me a bam file. Now I want to set up the .config file, but it does not seem friendly to input files other than SICILIAN output. I don't know how to write the input_file and meta file. Could you please give me some examples? I also don't understand the definition of "grouping_level_1 and grouping_level_2"; could you give me some explanation? Thank you in advance!