sandberg-lab / dataprivacy

GNU General Public License v3.0

output bam file smaller than original bam file #3

Closed. sropri closed this issue 1 year ago

sropri commented 1 year ago

Hi Guys,

I ran BAMboozle on my original BAM file, which is 2.9 GiB, but my output file is only 7.5 MiB. Shouldn't the output file be about the same size? Is something wrong with my reference genome? I am using the GRCh37 human build FASTA sequences, which is 3.1 GiB. I am confused by this, and your help would be appreciated.

cziegenhain commented 1 year ago

Hi!

Yes, the output indeed seems very small. My first guess is that the chromosome names do not match between the provided FASTA and the reference names present in the BAM file, in which case the aligned reads that cannot be bamboozled will be discarded.

Best, Christoph
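A quick way to test this guess is to compare the sequence names in the FASTA with the `@SQ` reference names in the BAM header. The sketch below is only an illustration and not part of BAMboozle: the function names are hypothetical, and it assumes the BAM header has already been dumped to text, for example with `samtools view -H`. A common cause of such a mismatch is one file using `1`, `2`, ... and the other using `chr1`, `chr2`, ...

```python
def fasta_names(fasta_lines):
    """Collect sequence names from FASTA header lines.

    The name is the text after '>' up to the first whitespace.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def sam_header_names(header_lines):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare_names(fasta_lines, header_lines):
    """Report which names are shared and which appear on one side only."""
    fa = fasta_names(fasta_lines)
    bam = sam_header_names(header_lines)
    return {
        "shared": sorted(fa & bam),
        "only_in_bam": sorted(bam - fa),
        "only_in_fasta": sorted(fa - bam),
    }


if __name__ == "__main__":
    # Toy data (made up for illustration): the FASTA uses "1"/"2",
    # while the BAM header uses "chr1"/"chr2".
    fasta = [">1 dna:chromosome", "ACGT", ">2 dna:chromosome", "GGCC"]
    header = [
        "@HD\tVN:1.6\tSO:coordinate",
        "@SQ\tSN:chr1\tLN:4",
        "@SQ\tSN:chr2\tLN:4",
    ]
    print(compare_names(fasta, header))
```

With real files, the same functions can be fed `open("genome.fa")` and the saved header text (both file names are placeholders). An empty `shared` list would be consistent with most reads being discarded.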

sropri commented 1 year ago

Hi,

Thank you for your response; you are correct. I was able to get the correct reference genome FASTA file, the one I aligned the FASTQ files against, and it worked perfectly. I appreciate you taking the time to help me with this.

Ali


(Sent by email in reply to cziegenhain's comment of Monday, May 22, 2023, 10:04 PM.)

cziegenhain commented 1 year ago

Perfect, glad it works now!