MGI-tech-bioinformatics / DNBelab_C_Series_HT_scRNA-analysis-software

An open-source and flexible pipeline for analyzing high-throughput DNBelab C Series single-cell RNA datasets
MIT License

Error in DNBC4tools run #17

Closed Shi-YuZhang closed 1 year ago

Shi-YuZhang commented 1 year ago

Thanks for the good tool. I ran into an error when running DNBC4tools run (screenshots attached: Snipaste_2023-03-07_21-44-44, Snipaste_2023-03-07_21-42-16, Snipaste_2023-03-07_21-43-18).

lishuangshuang0616 commented 1 year ago

(screenshot) Is there a problem with your fastq files? @Shi-YuZhang

Shi-YuZhang commented 1 year ago

I had multiple .fq.gz files in each of the L01/2/3/4 directories, and I merged them per lane using cat XXX/L01/*_1.fq.gz > XXX_L01_1.fq.gz and cat XXX/L01/*_2.fq.gz > XXX_L01_2.fq.gz. I'm not sure if this is right. (screenshot)

lishuangshuang0616 commented 1 year ago

You seem to have merged *_undecoded_1.fq.gz into the output. The reads in that fastq have not had the index trimmed off, so they are a little longer than the reads in the other files. @Shi-YuZhang
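One way to check whether a merged file mixes reads of different lengths is to tally the sequence lengths in it. The snippet below is only a rough sketch: the function name read_length_counts and the max_reads cap are made up for illustration and are not part of DNBC4tools. It relies on Python's gzip module reading concatenated .gz members as one stream, which is what cat produces.

```python
import gzip
from collections import Counter
from itertools import islice


def read_length_counts(fastq_gz_path, max_reads=100000):
    """Count sequence lengths in the first max_reads records of a .fq.gz file.

    Hypothetical helper for illustration; assumes standard 4-line FASTQ records.
    """
    counts = Counter()
    with gzip.open(fastq_gz_path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record,
        # i.e. 0-based line indexes 1, 5, 9, ...
        for line in islice(handle, 1, max_reads * 4, 4):
            counts[len(line.rstrip("\n"))] += 1
    return counts


# Example call (the path is a placeholder):
# print(read_length_counts("XXX_L01_1.fq.gz"))
```

If more than one length shows up in a file that should contain a single library, that is a hint that something like the undecoded fastq was merged in. Note that only the first max_reads records are inspected, so files appended at the end of a large merge may be missed unless the cap is raised.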

Shi-YuZhang commented 1 year ago

Do you mean that I should not merge *_undecoded_1.fq.gz for any lane, in both the cDNA directory and the Oligo directory?

lishuangshuang0616 commented 1 year ago

Do your cDNA and oligo libraries have no corresponding barcode, so that you need all the fastq files in the lane? If so, you still should not merge the undecoded fastq.

Shi-YuZhang commented 1 year ago

Sorry, I am a novice at processing data. Can you tell me how to confirm whether there is a corresponding barcode? And can you explain when I need all the fastq files and when I do not? Thanks in advance.

lishuangshuang0616 commented 1 year ago

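Hello. When libraries are constructed, each sample is given an index so that samples can be told apart, and the sequencing output has to be demultiplexed by index to know which sample each fastq belongs to. I'm not sure whether you need to merge all the fastq files in a lane and treat them as the data for one sample. If you do, still leave undecoded.fq.gz out of the merge: it holds the libraries that could not be assigned during demultiplexing, and its R2 reads are longer than those of the indexed libraries. Usually a sample corresponds to one index or several indexes, and you only need the fastq files for those indexes for the analysis. First confirm which index your sample uses, then find the fastq files for that index and use them for the analysis. @Shi-YuZhang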
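The merge described above could be sketched roughly as follows. This is not part of DNBC4tools: merge_lane_fastqs is a hypothetical helper, and the index filter assumes file names that end in _<index>_1.fq.gz or _<index>_2.fq.gz, which should be checked against your own files. Like cat, it concatenates the .gz files byte for byte, but it skips anything with "undecoded" in the name.

```python
import glob
import os
import shutil


def merge_lane_fastqs(lane_dir, read_suffix, out_path, indexes=None):
    """Concatenate one lane's .fq.gz files, skipping undecoded reads.

    Hypothetical helper. read_suffix is e.g. "_1.fq.gz" or "_2.fq.gz".
    indexes, if given, keeps only files named <prefix>_<index><read_suffix>
    (an assumed naming scheme). Write out_path outside lane_dir.
    """
    selected = []
    for path in sorted(glob.glob(os.path.join(lane_dir, "*" + read_suffix))):
        name = os.path.basename(path)
        if "undecoded" in name:
            continue  # reads that were not assigned to any index
        if indexes is not None and not any(
            name.endswith(f"_{idx}{read_suffix}") for idx in indexes
        ):
            continue  # belongs to a different sample's index
        selected.append(path)
    with open(out_path, "wb") as out:
        for path in selected:
            with open(path, "rb") as src:
                # Concatenated gzip members form a valid gzip stream.
                shutil.copyfileobj(src, out)
    return selected


# Example call (paths and index values are placeholders):
# merge_lane_fastqs("XXX/L01", "_1.fq.gz", "XXX_L01_1.fq.gz", indexes=["1", "2"])
```

The function returns the list of files it merged, in sorted order, so the same call with "_2.fq.gz" can be checked to have picked up the matching R2 files in the same order.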

Shi-YuZhang commented 1 year ago

Thanks, it works now.