ComparativeGenomicsToolkit / hal

Hierarchical Alignment Format

Missing negative values in conservation tracks #57

Open nickmachnik opened 6 years ago

nickmachnik commented 6 years ago

I am trying to run hal2assemblyHub in order to obtain conservation tracks for each of the genomes in my .hal alignment file:

hal2assemblyHub.py dro.hal conservation --conservation dmel-all-r6.16_cactusNames.gtf_CDS.bed --conservationGenomeName Dmel --defaultCpu 24

I get all the expected GenomeX_phyloP.bw files. However, all the values inside the .bw files are >= 0 (interestingly, some are -0). I assume this is the default output of phyloP when it is run with --mode CON, in which case large values (= small p-values) indicate conservation. Is there a way to make halPhyloP use --mode CONACC, so that I can get some information about acceleration, too? Or does halPhyloP already run phyloP with --mode CONACC, which would mean that there is simply no indication of acceleration in my data?

glennhickey commented 6 years ago

The mode is set to CONACC by default

https://github.com/ComparativeGenomicsToolkit/hal/blob/master/phyloP/impl/halPhyloPMain.cpp#L180-L181

so I don't think that's the issue.

nickmachnik commented 6 years ago

Alright, thank you for the fast answer!

nickmachnik commented 6 years ago

I just figured out that the ancestral .bw files (Anc00_phyloP.bw, Anc01_phyloP.bw, ...) do indeed have both negative and positive values. I was excluding them from the analysis before. None of the files for the actual input genomes has negative values, though. Curious.
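
For anyone who wants to repeat this check, here is a minimal sketch of how the sign distribution of a track could be tallied. It assumes the .bw file has first been dumped to text (wig or bedGraph), for example with the UCSC bigWigToWig utility, and that the score is the last whitespace-separated column of each data line. The function name summarize_scores is made up for this illustration and is not part of hal.

```python
import math
from collections import Counter


def summarize_scores(lines):
    """Tally negative, negative-zero, zero and positive scores.

    `lines` is any iterable of text lines in wig or bedGraph layout.
    Assumption: the score is the last column of every data line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and wig/track declaration lines.
        if not line or line.startswith(
            ("#", "track", "browser", "fixedStep", "variableStep")
        ):
            continue
        value = float(line.split()[-1])
        if value < 0:
            counts["negative"] += 1
        elif value > 0:
            counts["positive"] += 1
        elif math.copysign(1.0, value) < 0:
            # Distinguish -0.0 from 0.0, which compare equal otherwise.
            counts["negative_zero"] += 1
        else:
            counts["zero"] += 1
    return counts
```

Calling summarize_scores(open("Dmel_phyloP.wig")) on such a dump (the file name here is hypothetical) returns a Counter, so a leaf genome and an ancestor can be compared side by side. The -0 entries are counted separately because -0.0 == 0.0 in floating point, so a plain comparison would hide them.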