vlink / marge


MMARGE.pl denovo_motifs error #5

Closed aciernia closed 2 weeks ago

aciernia commented 5 years ago

Hi!

I am trying to examine de novo motifs in a set of ATACseq peak bed files compared to a background file using MMARGE.pl denovo_motifs. I have run into the following error when the shifting vectors are being loaded: Intervals must have positive width at /afs/.genomecenter.ucdavis.edu/software/mmarge/1.0/lssc0-linux//bin/analysis_tree.pm line 1079. (see below). My bed file runs fine with HOMER findMotifsGenome.pl, so I don't think it is a problem with the input peaks. Please let me know what I should do to fix this!

Thanks,

Annie

MMARGE.pl denovo_motifs BTBRmedia_greater_C57media.bed mm10 mmarge_denovomotifs_BTBRmedia_greater_C57media -bg Allconsensuspeaks.bed -fg_strain BTBR_T+_ITPR3TF_J -bg_strain C57BL6J -size given -p 30 -data_dir /share/lasallelab/Annie/BMDM/ATACanalysis_4_2018/BTBR_genome -genome_dir /share/lasallelab/Annie/BMDM/ATACanalysis_4_2018/BTBR_genome

Position file = BTBRmedia_greater_C57media.bed
Genome = mm10
Output Directory = mmarge_denovomotifs_BTBRmedia_greater_C57media
background position file: Allconsensuspeaks.bed
Using actual sizes of regions (-size given)
Fragment size set to given
Using 30 CPUs

/share/lasallelab/Annie/BMDM/ATACanalysis_4_2018/BTBR_genome/C57BL6J/preparsed/
Found mset for "mouse", will check against vertebrates motifs
Peak/BED file conversion summary:
    BED/Header formatted lines: 37049
    peakfile formatted lines: 0

Peak File Statistics:
    Total Peaks: 37049
    Redundant Peak IDs: 0
    Peaks lacking information: 0 (need at least 5 columns per peak)
    Peaks with misformatted coordinates: 0 (should be integer)
    Peaks with misformatted strand: 0 (should be either +/- or 0/1)

Peak file looks good!

Peak/BED file conversion summary:
    BED/Header formatted lines: 129420
    peakfile formatted lines: 0

remove overlapping target and background positions
Max distance to merge: direct overlap required (-d given)
Calculating co-bound peaks relative to reference: mmarge_denovomotifs_BTBRmedia_greater_C57media/bg.clean.pos

Comparing peaks: (peakfile, overlapping peaks, logRatio(obs/expected), logP)
    mmarge_denovomotifs_BTBRmedia_greater_C57media/target.clean..pos    37051   3.13    0.00

Co-bound by 0 peaks: 92369
Co-bound by 1 peaks: 37051 (max: 37051 effective total)

mv mmarge_denovomotifs_BTBRmedia_greater_C57media/0.410142421097593.2.tmp.coBoundBy0.txt mmarge_denovomotifs_BTBRmedia_greater_C57media/bg.clean.pos
Saving peaks
This it is: mmarge_denovomotifs_BTBRmedia_greater_C57media/targetgiven.pos
Loading shift vectors
Intervals must have positive width at /afs/.genomecenter.ucdavis.edu/software/mmarge/1.0/lssc0-linux//bin/analysis_tree.pm line 1079.
readline() on closed filehandle IN at /software/homer/4.9/lssc0-linux/bin/cleanUpSequences.pl line 31.
rm: cannot remove 'mmarge_denovomotifs_BTBRmedia_greater_C57media/0.410142421097593.tmp': No such file or directory
Not removing redundant sequences

Sequences processed:
    0 total

Here we do calculate targetCGBins
Frequency Bins: 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.6 0.7 0.8
Freq Bin Count
Illegal division by zero at /software/homer/4.9/lssc0-linux/bin/assignGeneWeights.pl line 63.
Normalizing lower order oligos using homer2
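
The message "Intervals must have positive width" reads like an interval-tree insert rejecting an interval whose end is not greater than its start (an assumption based on the wording, not confirmed from the MMARGE source). To rule out the input peaks independently of HOMER, here is a minimal sketch that lists such records in a BED file. It is not part of MMARGE or HOMER; the function name `find_nonpositive_intervals` is hypothetical, and it assumes tab-separated BED with chrom, start, and end in the first three columns.

```python
from typing import Iterable, List, Tuple


def find_nonpositive_intervals(lines: Iterable[str]) -> List[Tuple[int, str, int, int]]:
    """Return (line_number, chrom, start, end) for BED records with end <= start.

    Hypothetical helper for illustration only; not taken from MMARGE or HOMER.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and the header/comment lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            # Non-integer coordinates are a different problem; skip them here.
            continue
        if end <= start:
            bad.append((line_number, fields[0], start, end))
    return bad
```

Calling it as `find_nonpositive_intervals(open("BTBRmedia_greater_C57media.bed"))` and getting an empty list would only show that the peaks have positive width in reference coordinates. It says nothing about the widths after strain-specific shifting, which is the step where the run fails.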

vlink commented 5 years ago

Hi!

It seems to be a problem with the way MMARGE generates the genome. Could you share your VCF genome file for the BTBR_T+_ITPR3TF_J genome so I can try to trace back the error?

Thanks!

aciernia commented 5 years ago

I am more than happy to share the files. What is the best way to transfer them to you? They are 1.2 GB for the snps and ~350MB for the indels. Thanks so much!

vlink commented 5 years ago

Can you upload them somewhere so I can download them? You can send the link directly to my email (verena.m.link@gmail.com) if you don’t want to share your data publicly.

Best, Verena

aciernia commented 5 years ago

Terrific. I just sent links to your email. Please let me know if you didn't receive them. I really appreciate all the help!!!

PhrenoVermouth commented 2 years ago

I'm facing the same issue with the PWK strain. Now that two years have passed, are there any suggestions? Thanks a lot!