brentp / bwa-meth

fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome
https://arxiv.org/abs/1401.1129
MIT License

Broken pipe at samtools index #12

Closed will-NYGC closed 9 years ago

will-NYGC commented 10 years ago

Hi,

Can you help me understand this error:

... running: samtools index Sample_CGATGT_AC4190ANXX_L004_001.bam [fputs] Broken pipe

It seems the BAM is OK and only the indexing failed, but I want to be absolutely sure. Any ideas?

thanks for your help.

superbobry commented 10 years ago

Does calling index manually work for you?

will-NYGC commented 10 years ago

Yes, running it manually seems to work fine, and the index it creates is an appropriate size.

On Jul 6, 2014, at 12:18 PM, Sergei Lebedev <notifications@github.com> wrote:

Does calling index manually work for you?

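For anyone who wants a quick sanity check after re-running the index step by hand, here is a minimal sketch. It assumes the index sits next to the BAM as `<name>.bam.bai` (the default name that `samtools index` uses) and only checks that the file exists, is non-empty, and is not older than the BAM. The function name `index_is_current` is hypothetical, and this does not validate the contents of either file.

```python
import os


def index_is_current(bam_path, index_path=None):
    """Return True if the index exists, is non-empty, and is not older than the BAM.

    Assumption: the index lives at <bam_path>.bai unless index_path is given.
    This is a file-level check only; it does not parse the BAM or the index.
    """
    if index_path is None:
        index_path = bam_path + ".bai"
    if not os.path.exists(index_path):
        return False
    if os.path.getsize(index_path) == 0:
        return False
    # An index older than its BAM was probably built from a previous version.
    return os.path.getmtime(index_path) >= os.path.getmtime(bam_path)
```

A stale or empty index would make this return False, which is a hint to re-run `samtools index` on the file.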

brentp commented 10 years ago

This should be resolved if you update toolshed.

will-NYGC commented 9 years ago

Thanks Brent, that did the trick. I was able to pass the aligned BAMs through some QC collection with Picard without incident. However, I'm now having some issues with "bwameth.py tabulate". They may be issues with Bis-SNP rather than bwa-meth, but I'm hoping you might have some insight.

First, I may have missed it, but is this script aware of the markDuplicate flag? And does the tabulation take overlapping read pairs into consideration?

Second, I'm getting implausible methylation percentages, particularly at non-CpGs, of which very few are present in the output BED file. Do you have any thoughts as to where things went wrong? I ran the Bismark pipeline in parallel, and that gives <5% non-CpG methylation, as I expected.

Finally, is there a way to output a single BED from a BAM composed of multiple read groups?

thanks for your help.

Methylation summary in total:

CG: 1549660 87.476%
CH: 24742264 54.139%
C: 28382817 56.085%

Methylation summary in Read Group: readGroup1

C: 28384634 56.212%
CG: 1549183 87.488%
CH: 24768665 54.271%

Methylation summary in Read Group: readGroup2

C: 28381000 55.959%
CG: 1550137 87.464%
CH: 24715863 54.007%

brentp commented 9 years ago

  1. Could you try using scripts/tabulate-methylation.py instead of the BisSNP wrapper?
  2. Yes, BisSNP (and the script mentioned in 1.) will ignore duplicates.
  3. What command did you use to generate that data, and where do the summaries come from? (I don't understand how they could fail to match the BED file.) What did that command print to stderr?
  4. To make a single BED, you can just add up the per-read-group BED files.
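The addition suggested in point 4 could look roughly like the sketch below. The column layout is an assumption, not something confirmed in this thread: each input line is taken to be tab-separated with chrom, start, end, methylated count, and unmethylated count in the first five columns, so adjust the indices to match the real files. The function names are hypothetical. The one point that holds regardless of layout is that the counts are summed and the percentage is recomputed from the totals, since averaging per-read-group percentages would weight low-coverage sites incorrectly.

```python
from collections import defaultdict


def merge_bed_counts(paths):
    """Sum methylated/unmethylated counts per site across several BED files.

    Assumed (hypothetical) columns: chrom, start, end, methylated, unmethylated.
    """
    totals = defaultdict(lambda: [0, 0])
    for path in paths:
        with open(path) as handle:
            for line in handle:
                # Skip blank lines and common BED header lines.
                if not line.strip() or line.startswith(("#", "track")):
                    continue
                chrom, start, end, meth, unmeth = line.rstrip("\n").split("\t")[:5]
                key = (chrom, int(start), int(end))
                totals[key][0] += int(meth)
                totals[key][1] += int(unmeth)
    return totals


def write_merged(totals, out_path):
    """Write merged sites with a percentage recomputed from the summed counts."""
    with open(out_path, "w") as out:
        # Note: this sorts chromosome names lexicographically.
        for (chrom, start, end), (meth, unmeth) in sorted(totals.items()):
            depth = meth + unmeth
            pct = 100.0 * meth / depth if depth else 0.0
            out.write(f"{chrom}\t{start}\t{end}\t{pct:.1f}\t{meth}\t{unmeth}\n")
```

Sites that appear in only one read group's file are carried through with that file's counts unchanged.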