jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License
182 stars 27 forks source link

Genrich on bed derived bam file #85

Closed emoiso closed 4 months ago

emoiso commented 2 years ago

Hi John, How are you?

I was able to successfully run Genrich on a bam file (as follow).

$ samtools view file1.bam  | head -n2 
A01071:42:HLFVHDSXY:2:1101:1054:21104   99  chr12   131748536   255 147M3S  =   131748536   147 CCCCTCTAAGAACCTTCTCAGCACAGGAGTTCTGTTCTGTGTGTTAAATTCACACGAGATTAAGATCATGCAGAGATACGAGAGAACTGGCTCTGATTTTTGCAAGAAGCCAGTTGAATAGAGGGCCTTGGGAGATAATTAGGCAGACCC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:288    NM:i:1  MD:Z:2T144
A01071:42:HLFVHDSXY:2:1101:1054:21104   147 chr12   131748536   255 147M3S  =   131748536   -147    CCCCTCTAAGAACCTTCTCAGCACAGGAGTTCTGTTCTGTGTGTTAAATTCACACGAGATTAAGATCATGCAGAGATACGAGAGAACTGGCTCTGATTTTTGCAAGAAGCCAGTTGAATAGAGGGCCTTGGGAGATAATTAGGCAGACCC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:288    NM:i:1  MD:Z:2T144

$ Genrich -t file1.bam -o file1.narrowPeak -v
Processing experimental file #0 file1.bam
  BAM records analyzed:     1546996
    Unmapped:                661896
    Paired alignments:       885100
      secondary alns:        233130
    Unpaired alignments:          0
  Fragments analyzed:        325985
    Full fragments:          325985
      (avg. length: 4092.2bp)
    Half fragments:               0
- control file #0 not provided -
  Background pileup value: 0.408312
Peak-calling parameters:
  Genome length: 3267117988bp
  Significance threshold: -log(p) > 2.000
  Min. AUC: 200.000
  Max. gap between sites: 100bp
Peaks identified: 2828 (100707518bp)

$ head -n2 file1.narrowPeak 
chr1    132592  756094  peak_0  1000    .   1321089.125000  7.969894    -1  501409
chr1    2410431 2412564 peak_1  311 .   663.617493  2.311119    -1  1066

Great success, great software, thank you.

Now I'd like to call peak on a bam file, derived from a bed file. Is it even possible? I follow this procedure:

$ head -n2 file2.bed
chr19   3051851 3051852 1705553/m64018_210224_164809.2  .   5.31
chr19   3051941 3052004 1705553/m64018_210224_164809.3  .   5.31

$ bedtools bedtobam -i file2.bed -g genome.file | samtools sort -n > file2.bam

$ samtools view file2.bam | head -n2
250/m64018_210224_164809.2  0   chr19   3604412 255 1M  *   0   0   *   *
250/m64018_210224_164809.3  0   chr19   3604526 255 47M *   0   0   *   *

$ Genrich -t file2.bam -o file2.narrowPeak -vy
Processing experimental file #0: file2.bam
  BAM records analyzed:     2279978
    Paired alignments:            0
    Unpaired alignments:    2279978
  Fragments analyzed:       2279978
    Full fragments:               0
    Half fragments:         2279978
      (from unpaired alns)
Error! Experimental sample has no analyzable fragments

I assume the failure is caused by missing information/fields in my bam file. Can you help understand which are the required fields that are misssing, please?

Thank you in advance Best, Enrico

jsh58 commented 2 years ago

Thanks for the question. It is better to run Genrich on an original BAM file, but there shouldn't be a problem calling peaks from a valid BAM file derived from a BED file. Can you please post the first few lines of the BAM header? samtools view -H file2.bam | head -n5

emoiso commented 2 years ago

Thank you for the quick response.

This is the header:

$ samtools view -H file2.bam | head -n5
@HD VN:1.0  SO:coordinate
@PG ID:BEDTools_bedToBam    VN:Vv2.29.2
@PG ID:samtools PN:samtools PP:BEDTools_bedToBam    VN:1.10 CL:samtools sort
@PG ID:samtools.1   PN:samtools PP:samtools VN:1.10 CL:samtools view -H file2.bam
@SQ SN:chr1 AS:genome.file  LN:195154279
jsh58 commented 2 years ago

Thanks! I do not see a problem with the BAM header though. Could you attach the BAM file here? Or better yet, a small part of it that still gives the no analyzable fragments error. samtools view -h file2.bam | head -n1000 > file2_1k.sam