heathsc / gemBS

gemBS is a bioinformatics pipeline designed for high-throughput analysis of DNA methylation from Whole Genome Bisulfite Sequencing (WGBS) data.
GNU General Public License v3.0

mapping merge fails if one of the file_id's is equal to the Barcode #94

Open IsmailM opened 2 years ago

If I have the following samplesheet:

Barcode,File_id,end_1,end_2
S001,S001,sample1_R1.fq.gz,sample1_R2.fq.gz
S001,S001_1,sample1_1_R1.fq.gz,sample1_1_R2.fq.gz

(i.e. a barcode has multiple rows and one of its File_id values is equal to the barcode itself)

then during gemBS map, the merge does not work as expected.

Specifically, no index / md5 files are produced, and I suspect that the merging doesn't actually happen.

That is, during the first step of gemBS map, it generates S001.bam and S001_1.bam, which it would normally merge into a single file named S001.bam. But because a file with that name already exists, it does not run the merge (and hence the index/md5 files are missing...
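
Until this is handled inside gemBS, a samplesheet can be checked for this pattern before mapping. The following is only a minimal sketch, not gemBS code: it assumes that each per-file BAM is named after its File_id and the merged BAM after its Barcode, as described above, and the function name find_merge_collisions is hypothetical.

```python
import csv
import io
from collections import defaultdict


def find_merge_collisions(samplesheet_text):
    """Return {barcode: [file_ids]} for barcodes that have several rows
    and where one File_id is identical to the Barcode.

    Assumption (from the report above, not from gemBS source): the
    per-file BAM is <File_id>.bam and the merged BAM is <Barcode>.bam,
    so such a row would clash with the merge output name.
    """
    file_ids_by_barcode = defaultdict(list)
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        barcode = row["Barcode"].strip()
        file_ids_by_barcode[barcode].append(row["File_id"].strip())

    collisions = {}
    for barcode, file_ids in file_ids_by_barcode.items():
        # Only barcodes with more than one row need a merge step.
        if len(file_ids) > 1 and barcode in file_ids:
            collisions[barcode] = file_ids
    return collisions


# The samplesheet from this report.
SAMPLESHEET = """Barcode,File_id,end_1,end_2
S001,S001,sample1_R1.fq.gz,sample1_R2.fq.gz
S001,S001_1,sample1_1_R1.fq.gz,sample1_1_R2.fq.gz
"""

if __name__ == "__main__":
    for barcode, file_ids in find_merge_collisions(SAMPLESHEET).items():
        print(f"{barcode}: File_id values {file_ids} include the barcode itself")
```

If that assumption holds, renaming the first File_id to something distinct from the barcode (for example S001_0, a hypothetical name) would avoid the clash, since no per-file BAM would then share the merged file's name.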