sdparekh / zUMIs

zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
GNU General Public License v3.0

additional sequences #373

Open ljy-sys opened 1 year ago

ljy-sys commented 1 year ago

Thanks for this tool, which helps us analyze Smart-seq3 data. When I use the "additional_files" parameter to add an exogenous sequence, that sequence is missing from the results entirely rather than being reported with a UMI count of 0. The expression of all other genes is consistent with a run without this sequence. I would sincerely appreciate your help, thank you.

This is my .yaml file:

project: Q2Q4
sequence_files:
  file1:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.R1.fastq.gz
    base_definition:
      - cDNA(23-150)
      - UMI(12-19)
    find_pattern: ATTGCGCAATG
  file2:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.R2.fastq.gz
    base_definition:
      - cDNA(1-150)
  file3:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.I1.fastq.gz
    base_definition:
      - BC(1-8)
  file4:
    name: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/0-merged_fq/Q2Q4.merged.I2.fastq.gz
    base_definition:
      - BC(1-8)
reference:
  STAR_index: /storage1/project_raw/auto/test_with_CAR/reference/reference/star
  GTF_file: /storage1/project_raw/auto/test_with_CAR/reference/genes.gtf
  additional_STAR_params: '--limitSjdbInsertNsj 2000000 --clip3pAdapterSeq CTGTCTCTTATACACATCT'
  additional_files: /storage1/project_raw/auto/test_with_CAR/reference/CAR.fa
out_dir: /storage1/project_raw/auto/test_with_CAR/zUMIs/smartseq3/Q2Q4
num_threads: 32
mem_limit: 100
filter_cutoffs:
  BC_filter:
    num_bases: 3
    phred: 20
  UMI_filter:
    num_bases: 3
    phred: 20
barcodes:
  barcode_num: 96
  barcode_file: /storage1/project_raw/auto/test_with_CAR/smartseq3/Q2Q4/barcode.txt
  automatic: no
  BarcodeBinning: 1
  nReadsperCell: 100
counting_opts:
  introns: yes
  downsampling: '0'
  strand: 0
  Ham_Dist: 1
  velocyto: no
  primaryHit: yes
  twoPass: no
make_stats: yes
which_Stage: Filtering
Rscript_exec: Rscript
STAR_exec: STAR
pigz_exec: pigz
samtools_exec: samtools

I also got the file "additional_sequence_annot.gtf", and genes.gtf contains the additional information.

cziegenhain commented 1 year ago

Hi,

Sorry I do not understand the issue, what do you mean with "the result without this sequence is not a umicount of 0"?

ljy-sys commented 1 year ago

> Sorry I do not understand the issue, what do you mean with "the result without this sequence is not a umicount of 0"?

I mean that the inserted sequence does not appear in the umicount statistics at all, and the gene_names.txt file does not contain the gene either.

cziegenhain commented 1 year ago

Could you confirm that the BAM file has any aligned reads for the added sequence from the fasta file? You could e.g. run samtools idxstats on the .filtered.Aligned.GeneTagged.sorted.bam file for this (it reads the accompanying .bai index).
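
A minimal sketch of that check, assuming the added sequence appears in the BAM header under the name of its FASTA record. The helper `mapped_reads_for`, the variable `$BAM`, and the sequence name `CAR` are placeholders for illustration, not part of zUMIs. The function only parses the four tab-separated columns that `samtools idxstats` prints (reference name, sequence length, mapped reads, unmapped reads).

```shell
# Read `samtools idxstats` output on stdin and print the mapped-read count
# for one reference sequence, or "absent" if that name is not listed at all.
# idxstats columns: name, length, mapped reads, unmapped reads (tab-separated).
mapped_reads_for() {
  awk -F'\t' -v ref="$1" '
    $1 == ref { print $3; found = 1 }
    END { if (!found) print "absent" }
  '
}

# Hypothetical usage: $BAM and the sequence name "CAR" are placeholders.
# samtools idxstats "$BAM" | mapped_reads_for CAR
```

If this prints "absent", the sequence was probably not part of the reference used for mapping. If it prints 0, the sequence is in the reference but no reads aligned to it. A positive number would suggest the reads are being lost later, at the gene assignment or counting stage.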