matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

Add quadratic entropy #199

Closed cmccoy closed 12 years ago

cmccoy commented 12 years ago

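guppy pentropy should be renamed guppy entropy, with two entropy measures:

  • phylogenetic entropy (see #189)
  • quadratic entropy, described in a phylogenetic context in this paper.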

The original formula for quadratic entropy is:

Q = \sum_{i<j} p_i p_j \delta_{ij}

Where p_i and p_j are the proportions of mass on taxa i and j respectively, and \delta_{ij} is the phylogenetic distance between them.

To make things a little faster, this can be rearranged so that:

Q = \sum_{s \in S} l(s) (\sum_{i \in R_s} p_i) (\sum_{j \in N_s} p_j)

Where S contains all snippets on the tree between placements, l(s) is the length of snippet s, R_s contains all placements on the proximal side of s, N_s contains all placements on the distal side of s, and p_i is the proportion of the total mass on placement i.
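As a sanity check on the rearrangement, here is a minimal Python sketch that computes Q both ways on a toy tree. Everything in it is hypothetical and not pplacer code: the tree, the masses, and the function names are made up for illustration, and mass is assumed to sit only at nodes, so that each edge is exactly one snippet.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch length). "root" has no parent.
PARENT = {
    "a": ("x", 1.0),
    "b": ("x", 2.0),
    "x": ("root", 0.5),
    "c": ("root", 3.0),
}
# Assumed proportion of the total mass at each node (sums to 1).
MASS = {"a": 0.5, "b": 0.25, "c": 0.25}


def path_to_root(node):
    """Return {ancestor: distance from node}, including node itself."""
    dist, out = 0.0, {node: 0.0}
    while node in PARENT:
        parent, length = PARENT[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def tree_distance(i, j):
    """Path length between i and j through their closest common ancestor."""
    pi, pj = path_to_root(i), path_to_root(j)
    return min(pi[n] + pj[n] for n in pi if n in pj)


def quadratic_entropy_pairwise(mass):
    """Q = sum_{i<j} p_i p_j delta_ij."""
    return sum(mass[i] * mass[j] * tree_distance(i, j)
               for i, j in combinations(sorted(mass), 2))


def quadratic_entropy_by_edge(mass):
    """Q = sum_s l(s) * (proximal mass) * (distal mass)."""
    total = 0.0
    for child, (_, length) in PARENT.items():
        # Distal mass: nodes whose path to the root passes through `child`.
        distal = sum(m for n, m in mass.items() if child in path_to_root(n))
        proximal = sum(mass.values()) - distal
        total += length * proximal * distal
    return total


q_pairwise = quadratic_entropy_pairwise(MASS)
q_by_edge = quadratic_entropy_by_edge(MASS)
```

The edge-wise function here recomputes paths for clarity, so it only demonstrates that the two forms agree. The speedup would come from accumulating the distal mass in a single traversal of the tree.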

matsen commented 12 years ago

Great. Let's have the two columns be named "phylogenetic" and "quadratic".

The docs should include these descriptions. They can be specified in RST like in guppy_kr.rst:

.. math::
    Z(P,Q) =
    \int_T \left| P(\tau(y)) - Q(\tau(y)) \right| \, \lambda(dy).
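
Following the same pattern, the two forms of quadratic entropy from this issue might be written roughly as follows. This is only a sketch of possible markup, using the symbols defined in the issue description, and is not text from the existing docs:

.. math::
    Q = \sum_{i<j} p_i \, p_j \, \delta_{ij}
      = \sum_{s \in S} l(s)
        \left( \sum_{i \in R_s} p_i \right)
        \left( \sum_{j \in N_s} p_j \right).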

On Fri, Dec 23, 2011 at 10:04 AM, Connor McCoy reply@reply.github.com wrote:

guppy pentropy should be renamed guppy entropy, with two entropy measures:

  • phylogenetic entropy (see #189)
  • quadratic entropy, described in a phylogenetic context in this paper.

The original formula for quadratic entropy is:

   Q = \sum_{i<j} p_i p_j \delta_{ij}

Where p_i and p_j are the proportions of mass on taxa i and j respectively, and `\delta_{ij}` is the phylogenetic distance between them.

To make things a little faster, this can be rearranged so that:

   Q = \sum_{s \in S} l(s) (\sum_{i \in R_s} p_i) (\sum_{j \in N_s} p_j)

Where S contains all snippets on the tree between placements, R_s contains all placements on the proximal side of snippet s, and N_s contains all placements on the distal side of s.



Frederick "Erick" Matsen, Assistant Member Fred Hutchinson Cancer Research Center http://matsen.fhcrc.org/