meuleman / epilogos

Methods for summarizing and visualizing multi-biosample functional genomic annotations
https://epilogos.net
GNU General Public License v3.0

test run failed #2

Closed crazyhottommy closed 6 years ago

crazyhottommy commented 6 years ago

Hi,

I was trying the examples.


computeEpilogos_singleChromosomeSingleProcessor.sh data/chr1_127epigenomes_15observedStates.txt.gz 0 15 KL "33-34,37-45,47-48,61"

ls KL
chr1_Pnumerators.txt  p1a_chr1.stderr  p1a_chr1.stdout  p2_chr1.stderr  p2_chr1.stdout  Q.txt

cat p2_chr1.stderr 
Usage type #1:  /scratch/genomic_med/apps/epilogos/bin/computeEpilogosPart2_perChrom infile KLtype NsitesGenomewide infileQ outfileObs outfileQcat chr [infileQ2]
where
* infile holds the tab-delimited state or state pair IDs observed in the epigenomes or pairs of epigenomes, one line per genomic segment
* KLtype is either 0 (for KL), 1 (for KL*), or 2 (for KL**)
  KL compares states, KL* compares tallies of state pairs, and KL** compares state pairs of individual epigenome pairs
* NsitesGenomewide is the total number of sites observed genome-wide
* infileQ contains the Q, Q*, or Q** tally matrix (also see below)
* outfileObs will receive genomic coordinates (regions on chromosome "chr" of width regionWidth, starting at firstBegPos),
  the state (or state pair) making the largest contribution to the metric,
  the magnitude of that contribution, and the total value of the metric.
  If two groups are specified (see below), it will also include a column containing +/-1,
  specifying whether the first group (+1) or the second (-1) contributes more to the overall metric.
* outfileQcat will be in "qcat" format, uncompressed
* Optional additional argument infileQ2 can be used to specify Q, Q*, or Q** for a 2nd group of epigenomes,
  in which case the metric quantifies the difference (distance) between them.

Usage type #2:  /scratch/genomic_med/apps/epilogos/bin/computeEpilogosPart2_perChrom infile KLtype NsitesGenomewide infileQ1 infileQ2 outfileNulls
where
* infile contains random permutations of states (or state pairs) observed in the initial input data
* outfileNulls will receive the total difference metric for each line of permuted states (or state pairs)
* the remaining arguments are the same as described above
This second "usage type" is used to generate a null distribution, for estimating significance
of the metric values calculated via "usage type 1."

The other files are empty, except for chr1_Pnumerators.txt (60M in size). Any idea what's wrong? BTW, do you have an estimate of the time and CPU usage for the examples?

Thanks! Tommy
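A quick way to confirm which output files are empty, as reported above, is sketched below. The helper name `report_outputs` is hypothetical and not part of epilogos; it only assumes a POSIX shell and an output directory such as `KL`.

```shell
# Hypothetical helper (not part of epilogos): for each regular file in a
# directory, print its name followed by "empty" or its size in bytes.
report_outputs() {
    dir="$1"
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$f" ] || continue
        if [ -s "$f" ]; then
            # -s is true when the file exists and has a size greater than zero.
            printf '%s\t%s bytes\n' "$(basename "$f")" "$(wc -c < "$f" | tr -d ' ')"
        else
            printf '%s\tempty\n' "$(basename "$f")"
        fi
    done
}
```

Calling `report_outputs KL` after the run would list each file in the `KL` directory along with whether it received any output.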

meuleman commented 6 years ago

Thanks for your interest and feedback Tommy -- we're looking into this issue now.

crazyhottommy commented 6 years ago

Do you have any updates? Thanks!

erynes commented 6 years ago

What's in p1a_chr1.stderr? And what's in the .stdout files?

crazyhottommy commented 6 years ago

p1a_chr1.stderr and p1a_chr1.stdout are empty

cat Q.txt 
96366   142716  19653   748567  1816813 93142   502546  14882   135080  6219    13877   27887   174821  2009257 11645716 ...

erynes commented 6 years ago

Ah, I see what the problem is. I had changed the input requirements of one of the binaries to make it flexible with respect to genomic coordinates, and I changed the main (multi-processor) epilogos script to reflect this, but I didn't make a corresponding change in the single-processor script. I'll open a bug, then fix and close it. Once that's done, pick up the fix, and it should work for you. Apologies.
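A mismatch like this shows up as a usage message in a step's stderr log, as in p2_chr1.stderr above. The following is a minimal sketch of a check for that symptom; the helper name `find_usage_errors` is hypothetical, and it assumes the logs end in `.stderr` and that a usage message starts a line with `Usage`, as in the output quoted in this thread.

```shell
# Hypothetical helper (not part of epilogos): list the .stderr logs in a
# directory that contain a line starting with "Usage", which suggests a
# program was invoked with arguments it did not expect.
find_usage_errors() {
    dir="$1"
    # -l prints only the names of matching files; -s suppresses errors
    # about missing or unreadable files.
    grep -ls '^Usage' "$dir"/*.stderr
}
```

Running `find_usage_errors KL` would print the paths of any affected logs and return a non-zero status if none are found.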

crazyhottommy commented 6 years ago

Thanks for catching it! Let me know when you have it fixed.

crazyhottommy commented 6 years ago

I just pulled the latest version and it looks like it is working. Thanks!