MGI-tech-bioinformatics / DNBelab_C_Series_HT_scRNA-analysis-software

An open-source and flexible pipeline to analyze high-throughput DNBelab C Series single-cell RNA datasets
MIT License

Question about adding marker genes to the reference genome #50

hejian41 closed this issue 5 months ago

hejian41 commented 9 months ago

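Thank you for providing dnbc4tools! We tried to add a fluorescent reporter gene sequence when building the reference genome. After adding the reporter sequence to the fasta and a GENCODE-format gene annotation to the gtf, we generated the reference genome with mkref (the geneInfo.tab file contains the target gene). However, after the run finished, the target gene is missing from the features.tsv file in the results. Was it filtered out at some step, or does it need special handling? Looking forward to your reply, thank you!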
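
For readers following along, the setup described above can be sketched in Python. This is a minimal illustration, not the poster's actual files: the contig name `REPORTER`, the IDs, the source column `custom`, and the choice of GENCODE-style attributes are all assumptions, and the thread does not say which attributes mkref actually requires.

```python
def reporter_gtf_lines(contig, gene_id, gene_name, length, strand="+"):
    """Return gene, transcript and exon GTF records spanning one whole contig.

    The attribute set mimics GENCODE-style annotation; it is an assumption,
    not a documented requirement of dnbc4tools mkref.
    """
    gene_attrs = (
        f'gene_id "{gene_id}"; gene_type "protein_coding"; '
        f'gene_name "{gene_name}";'
    )
    tx_attrs = (
        f'gene_id "{gene_id}"; transcript_id "{gene_id}.t1"; '
        f'gene_type "protein_coding"; gene_name "{gene_name}"; '
        f'transcript_type "protein_coding"; transcript_name "{gene_name}-201";'
    )
    lines = []
    for feature, attrs in (
        ("gene", gene_attrs),
        ("transcript", tx_attrs),
        ("exon", tx_attrs),
    ):
        # GTF has 9 tab-separated columns; coordinates are 1-based, inclusive.
        fields = [contig, "custom", feature, "1", str(length), ".", strand, ".", attrs]
        lines.append("\t".join(fields))
    return lines


def fasta_record(contig, sequence, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    body = "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
    return f">{contig}\n{body}\n"
```

The contig name in the FASTA header must match the first column of the GTF records exactly. The output of both functions would be appended to the existing genome fasta and gtf before running mkref.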

lishuangshuang0616 commented 9 months ago

You only need to add the corresponding information to the fasta and gtf files. If the gene is missing from features.tsv, either the gene is not expressed (cellranger lists all genes, whereas dnbc4tools lists only expressed genes), or the sequence is highly similar to another region of the genome, so its reads are multi-mapped and filtered out during alignment.
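
One rough way to probe the similarity explanation is to ask what fraction of the reporter's k-mers also occur in a reference sequence. The sketch below is only an illustration under stated assumptions: the function names and the default `k=25` are hypothetical, both strands of the reference are considered, and holding every k-mer in a Python set is practical for a single region or small contig, not a whole genome.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(reporter, reference, k=25):
    """Fraction of reporter k-mers found on either strand of reference."""
    reporter_kmers = kmers(reporter, k)
    if not reporter_kmers:
        return 0.0  # reporter shorter than k
    reference_kmers = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return len(reporter_kmers & reference_kmers) / len(reporter_kmers)
```

A value near 1.0 against some genomic region would be consistent with reads from the reporter mapping to more than one place; a value near 0.0 makes multi-mapping a less likely explanation.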

hejian41 commented 9 months ago

Thanks for the reply. I am wondering whether the fluorescent gene sequence added to the fasta file should include the 3' UTR or the poly(A) nucleotides, since the RT reaction during library construction only captures mRNAs with a poly(A) tail. In fact, the fluorescent protein can be detected in the same cells, which means the gene was expressed.

lishuangshuang0616 commented 9 months ago

Check whether any reads are aligned to this sequence in _01.data/finalsort.bam
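
A check like this can be done on the text output of `samtools view`. The sketch below counts alignments on one contig and splits them into unique and multi-mapped reads. It assumes STAR's default MAPQ convention (255 for uniquely mapped reads), which the pipeline may or may not keep; the function name and the contig name `REPORTER` are hypothetical.

```python
from collections import Counter


def count_alignments(sam_lines, contig):
    """Count mapped records on `contig`, split into unique and multi-mapped.

    Assumes STAR's default convention that MAPQ 255 marks a uniquely
    mapped read; any other MAPQ is counted as multi-mapped.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4 or rname != contig:
            continue  # unmapped, or mapped to a different contig
        counts["unique" if mapq == 255 else "multi"] += 1
    return counts
```

To use it, pipe the output of `samtools view` on the BAM file into a small script that calls `count_alignments(sys.stdin, "REPORTER")`. Zero records in both categories means no read was placed on the reporter at all, which points at the sequence or the index rather than at multi-mapping.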

hejian41 commented 9 months ago

Thanks for the suggestion. I have checked the final_sort.bam file and did not find any alignments to the sequence. Does this mean that I supplied the wrong sequence?

lishuangshuang0616 commented 9 months ago

That is possible. If you are not sure, you can download a recent version of STAR and align the R2 reads to the genome yourself.
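
A standalone STAR run on the R2 reads could be assembled along these lines. This is a sketch, not the pipeline's own invocation: the paths are placeholders, the helper name is hypothetical, and it assumes a STAR index built with the same STAR version that includes the reporter sequence (an index produced by mkref may not load in a different STAR release, in which case it would need to be regenerated with `--runMode genomeGenerate`).

```python
import shlex


def build_star_command(genome_dir, r2_fastq, out_prefix, threads=8):
    """Build an argument list for a single-end STAR alignment of R2 reads."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", r2_fastq,
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", out_prefix,
    ]
    if r2_fastq.endswith(".gz"):
        # STAR reads compressed input through an external command.
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


# Placeholder paths; print the command for inspection instead of running it.
example = build_star_command("star_index/", "sample_R2.fastq.gz", "reporter_check_")
print(shlex.join(example))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths are real. If reads align to the reporter here but not in the pipeline output, the difference lies in the pipeline's index or filtering rather than in the sequence itself.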

hejian41 commented 9 months ago

OK, I will try it. Thanks.