gamcil / clinker

Gene cluster comparison figure generator
MIT License

Possible to add phylogeny to clinker output plot? #15

Closed jelber2 closed 3 years ago

jelber2 commented 3 years ago

Hi, I was wondering whether it would be possible to add phylogeny information to a clinker plot.

So, for example, let's say you have some gene clusters made from the GenBank flat files (gbff) of 37 birds:

$ ls *.gbff | perl -pe 's/_/ /g; s/\.gbff$//; s/Strigops habroptila/Strigops greyii/' > names

$ cat names
Acanthisitta chloris
Anas platyrhynchos
Aptenodytes forsteri
Apteryx rowi
Calidris pugnax
Camarhynchus parvulus
Catharus ustulatus
Charadrius vociferus
Chiroxiphia lanceolata
Columba livia
Corapipo altera
Cyanistes caeruleus
Cygnus atratus
Dromaius novaehollandiae
Egretta garzetta
Falco peregrinus
Ficedula albicollis
Gallus gallus
Geospiza fortis
Haliaeetus leucocephalus
Manacus vitellinus
Meleagris gallopavo
Neopelma chrysocephalum
Nipponia nippon
Nothoprocta perdicaria
Oxyura jamaicensis
Parus major
Phasianus colchicus
Pipra filicauda
Pseudopodoces humilis
Pygoscelis adeliae
Serinus canaria
Strigops greyii
Sturnus vulgaris
Taeniopygia guttata
Tauraco erythrolophus
Zonotrichia albicollis

and then you use the R package rotl to extract one phylogenetic hypothesis for the 37 bird species

> library(rotl)
> taxa<-tnrs_match_names(names= c(scan("names",what="",sep='\n')))
> my_tree <- tol_induced_subtree(ott_ids = taxa$ott_id, label_format="name")
> png(filename = "birds.png",width=600,height=600)
> plot(my_tree, no.margin = TRUE)
> dev.off()
> tol_induced_subtree(ott_ids = ott_id(taxa), file="ghrl.newick.txt", label_format="name")

and you get the following birds.png

[Image: birds.png]

and you also get a Newick tree file, ghrl.newick.txt, that encapsulates the relationships in birds.png in text format

Might it be possible to pass ghrl.newick.txt to clinker to generate something similar to birds.png?

gamcil commented 3 years ago

This is definitely on the to-do list but is not possible at the moment, I'm afraid. In the meantime, maybe you can try adjusting the spacing between each leaf on the tree to line them up with the cluster figure produced by clinker. I'm working on a little tree visualisation that should be able to slot into the clinker output, but it's still very much a WIP and I'm not sure when I'll be able to finish it.

jelber2 commented 3 years ago

Oh cool, I look forward to testing those features once they are available in clinker! With 37 species, though, it can be a little frustrating to manually sort individual labels/gene tracks. At least with Firefox and the -p output, moving one label sometimes moves other labels as well. I haven't tried it in Google Chrome. A simple fix might be to force the order of labels/gene tracks beforehand, as input to clinker, and then just add the phylogeny manually afterwards. Is that more doable?

jelber2 commented 3 years ago

Just following up: might it be possible to add an option to force the output order based on some input file?

gamcil commented 3 years ago

Yup, working on it + some other things for the next release. Shouldn't be too long

jelber2 commented 3 years ago

Super cool! Looking forward to it!

gamcil commented 3 years ago

You can now do this in clinker 0.0.7 with the -ufo/--use_file_order flag, which will force clinker to display clusters in the order they are specified.

e.g. clinker one.gbk three.gbk two.gbk -ufo -p

You can do this from a file, e.g.:

one.gbk
three.gbk
two.gbk

By doing something like:

clinker $(cat file.txt) -ufo -p
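
A file like this could also be generated from the leaf order of a tree, so that the clusters come out in the same top-to-bottom order as the phylogeny. The sketch below is a minimal JavaScript (Node.js) illustration and is not part of clinker: `orderFilesByLeaves` is a hypothetical helper, and it assumes each GenBank file is named after its leaf label plus a `.gbff` extension (e.g. `Gallus_gallus.gbff`). The leaf order itself is assumed to have been read from the Newick file beforehand.

```javascript
// Hypothetical helper (not a clinker feature): sort GenBank files by tree leaf order.
// Assumes file names are "<leaf label><extension>", e.g. "Gallus_gallus.gbff".
function orderFilesByLeaves(leafNames, fileNames, extension = ".gbff") {
  // Index the files by the label they would carry in the tree.
  const byLabel = new Map();
  for (const file of fileNames) {
    const label = file.endsWith(extension)
      ? file.slice(0, -extension.length)
      : file;
    byLabel.set(label, file);
  }

  const ordered = []; // files in tree order
  const missing = []; // leaf labels without a matching file
  for (const leaf of leafNames) {
    if (byLabel.has(leaf)) {
      ordered.push(byLabel.get(leaf));
    } else {
      missing.push(leaf);
    }
  }
  return { ordered, missing };
}

// Example with three of the bird labels, listed in an assumed tree order.
const leaves = ["Ficedula_albicollis", "Catharus_ustulatus", "Sturnus_vulgaris"];
const files = [
  "Catharus_ustulatus.gbff",
  "Ficedula_albicollis.gbff",
  "Sturnus_vulgaris.gbff",
];
const { ordered, missing } = orderFilesByLeaves(leaves, files);

// One file name per line, in a form that could be saved as file.txt.
console.log(ordered.join("\n"));
if (missing.length > 0) {
  console.error("No file found for: " + missing.join(", "));
}
```

Labels without a matching file end up in `missing`. In this thread that would presumably happen for `Strigops_greyii`, because the label was renamed from `Strigops habroptila` when the names file was built. The `ordered` list, written one name per line to `file.txt`, could then be passed to clinker with `-ufo` as shown above.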

jelber2 commented 3 years ago

Thanks!

jelber2 commented 3 years ago

Just noticed that there are some d3 related phylogeny tools (see https://www.jasondavies.com/tree-of-life/) and even a Newick tree parser. Not sure how to easily incorporate it into clinker though.

See the HTML code below for an example using newick.js and d3.phylogram.js

<!DOCTYPE html>
<html lang='en' xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'>
  <head>
    <meta content='text/html;charset=UTF-8' http-equiv='content-type'>
    <title>Right-angle phylograms and dendrograms with d3</title>
    <script src="https://d3js.org/d3.v3.min.js" type="text/javascript"></script>
    <script src="http://bl.ocks.org/kueda/raw/1036776/newick.js" type="text/javascript"></script>
    <script src="http://bl.ocks.org/kueda/raw/1036776/d3.phylogram.js" type="text/javascript"></script>
    <script>
      function load() {
        var newick = Newick.parse("(((((((((((((((((((((((((((((((((((((((((((((Ficedula_albicollis)mrcaott22300ott107840)mrcaott22300ott416089)mrcaott22300ott629342)mrcaott22300ott416087)mrcaott22300ott3598245)mrcaott22300ott130294)mrcaott22300ott67150)mrcaott22300ott909199)mrcaott22300ott547548)mrcaott22300ott35350)mrcaott1566ott22300)mrcaott1566ott113980)mrcaott1566ott24297)mrcaott1566ott32651)mrcaott1566ott35326,(((((((Catharus_ustulatus)mrcaott252688ott489372)Catharus)mrcaott252687ott288678)mrcaott252687ott712913)mrcaott252687ott775708)mrcaott19467ott252687)mrcaott19467ott431648)mrcaott1566ott19467,((((((Sturnus_vulgaris)mrcaott366470ott565813)mrcaott2224ott366470)mrcaott2175ott2224)mrcaott2175ott59905)mrcaott2175ott259082)mrcaott2175ott968664)mrcaott1566ott2175)mrcaott1566ott496009)mrcaott1566ott3598440)mrcaott246ott1566)mrcaott246ott5934,(((((((((((((Taeniopygia_guttata)Taeniopygia)mrcaott311555ott445491)mrcaott311555ott1082386)mrcaott105913ott311555)mrcaott24017ott105913)mrcaott4083ott24017)mrcaott4083ott52094)mrcaott4083ott11712,((((((((((((((((((((((((((((((Camarhynchus_parvulus)mrcaott419367ott963505)Camarhynchus,((((Geospiza_fortis)mrcaott589951ott5991469)mrcaott589951ott589956)mrcaott589951ott589953)Geospiza)mrcaott419363ott589951)mrcaott419363ott992256)mrcaott419363ott589943)mrcaott419363ott527327)mrcaott419363ott589949)mrcaott419363ott422527)mrcaott349672ott419363)mrcaott4088ott349672)mrcaott4088ott349675)mrcaott4088ott884086)mrcaott4088ott90856)mrcaott4088ott227256)mrcaott4088ott273602)mrcaott4088ott7060)mrcaott4088ott20532)mrcaott4088ott89879)mrcaott4088ott3599764)mrcaott4088ott20545)mrcaott4088ott6566)mrcaott4088ott365553)mrcaott4088ott9377)mrcaott4088ott23955)mrcaott4088ott420995,(((((((((((Zonotrichia_albicollis)mrcaott265547ott265552)mrcaott125079ott265547)'Zonotrichia (genus in domain Eukaryota)')mrcaott125079ott765405)mrcaott125079ott463026)mrcaott6023ott125079)mrcaott6023ott101225)mrcaott6023ott243614)mrcaott5616ott6023)mrcaott5616ott28339)mrcaott5616ott5620)mrcaott4088ott5616,(((((((((((((Serinus_canaria)mrcaott238137ott464865)mrcaott238137ott328909)mrcaott6375ott238137)mrcaott6375ott119724)mrcaott6366ott6375)mrcaott6366ott238142)mrcaott6366ott405215)mrcaott6366ott641497)mrcaott6366ott178457)mrcaott6366ott157599)mrcaott6366ott341465)mrcaott6366ott88283)mrcaott6366ott28332)mrcaott4088ott6366)mrcaott4088ott8371)mrcaott4088ott9416)mrcaott4088ott95302)mrcaott4083ott4088)mrcaott4083ott370807)mrcaott4083ott35042)mrcaott3364ott4083)mrcaott3364ott73828)mrcaott246ott3364,(((((((((Cyanistes_caeruleus)mrcaott123763ott258794)Cyanistes,((((((Parus_major)mrcaott84656ott492911)Parus,(Pseudopodoces_humilis)Pseudopodoces)mrcaott84656ott875992)mrcaott84656ott325806)mrcaott84656ott325811)mrcaott84656ott5925750)mrcaott61147ott84656)mrcaott2375ott61147)mrcaott2375ott814750)mrcaott2375ott71358)mrcaott2375ott73144)mrcaott1488ott2375)mrcaott1488ott72472)mrcaott246ott1488)mrcaott246ott10351)mrcaott246ott176461)mrcaott246ott22325)mrcaott246ott4820)mrcaott246ott32658)mrcaott246ott5929)mrcaott246ott44866)mrcaott246ott428578,((((((((((Chiroxiphia_lanceolata)mrcaott25827ott590809,(((Corapipo_altera)mrcaott775796ott1061608)mrcaott319674ott775796)mrcaott134638ott319674)mrcaott25827ott134638)mrcaott25827ott485120,((((((Pipra_filicauda)mrcaott946233ott982239)mrcaott128718ott946233,(((Manacus_vitellinus)mrcaott640833ott1061609)mrcaott129404ott640833)Manacus)mrcaott128718ott129404)mrcaott105307ott128718)mrcaott49248ott105307)mrcaott49248ott771685)mrcaott25827ott49248,((((Neopelma_chrysocephalum)mrcaott356287ott1061598)mrcaott175383ott356287)Neopelma)mrcaott105300ott260633)mrcaott25827ott105300)mrcaott25827ott232053)mrcaott8441ott25827)mrcaott8441ott41222)mrcaott3212ott8441)mrcaott3212ott33874)mrcaott246ott3212,((Acanthisitta_chloris)Acanthisitta)Acanthisittidae)Passeriformes,((Strigops_greyii)Strigops)Psittaciformes)mrcaott246ott7113,((((((((((((Falco_peregrinus)mrcaott47588ott432081)mrcaott47588ott48171)mrcaott47588ott183625)mrcaott47588ott352522)mrcaott47588ott352524)mrcaott47588ott137528)mrcaott47588ott201377)mrcaott47588ott179290)mrcaott47588ott748842)mrcaott47588ott225286)Falconidae)Falconiformes)mrcaott246ott47588)mrcaott246ott3600042)mrcaott246ott2907,(((((((((((((((((Haliaeetus_leucocephalus)mrcaott506937ott773042)mrcaott506937ott773043)mrcaott327488ott506937)Haliaeetus)mrcaott68001ott95329)mrcaott8285ott68001)mrcaott8285ott919198)mrcaott8285ott11582)mrcaott8285ott34284)mrcaott1858ott8285)mrcaott1858ott103122)mrcaott1858ott47576)Accipitrinae)Accipitridae)mrcaott1858ott806938)mrcaott1858ott1036186)Accipitriformes)mrcaott246ott1858)mrcaott246ott928360,((((((((((((Calidris_pugnax)mrcaott651066ott1090732)mrcaott24121ott651066)mrcaott24121ott214779)mrcaott24121ott654830)mrcaott24121ott45306)mrcaott24121ott217797)Scolopacidae)mrcaott5272ott24121)mrcaott5272ott7639,((((((((((Charadrius_vociferus)mrcaott234677ott661811)mrcaott129402ott234677)mrcaott129402ott3596997)mrcaott129402ott238463)mrcaott129402ott214792)mrcaott129402ott673638)mrcaott112937ott129402)mrcaott57823ott112937)mrcaott57823ott242771)mrcaott57823ott57827)mrcaott5272ott57823)mrcaott5272ott92263,(((((((((((((((((Egretta_garzetta)mrcaott126087ott273425)mrcaott55044ott126087)mrcaott55044ott1032052)mrcaott55044ott744560)mrcaott55044ott628056)mrcaott55044ott105529)mrcaott55044ott244700)mrcaott55044ott105531)mrcaott55044ott154815)Ardeidae)mrcaott55044ott316989,((((Nipponia_nippon)Nipponia)mrcaott192642ott453063)mrcaott192642ott242774)Threskiornithidae)mrcaott55044ott192642)mrcaott9830ott55044)mrcaott9830ott324158,(((((((((((((((((Pygoscelis_adeliae)Pygoscelis,(Aptenodytes_forsteri)Aptenodytes)mrcaott134466ott494361)mrcaott60413ott134466)mrcaott60413ott4130817)mrcaott60413ott4130819)mrcaott60413ott3600129)mrcaott60413ott3600128)mrcaott60413ott3600124)mrcaott60413ott4130831)mrcaott60413ott3600127)mrcaott60413ott4130830)mrcaott60413ott4130835)mrcaott60413ott4130813)mrcaott60413ott3600120)Spheniscidae)Sphenisciformes)mrcaott18206ott60413)mrcaott9830ott18206)mrcaott9830ott90560)mrcaott9830ott86672)mrcaott5272ott9830)mrcaott246ott5272)mrcaott246ott7145,(((((((((((Tauraco_erythrolophus)mrcaott487306ott772747)mrcaott331533ott487306)mrcaott331533ott487309)mrcaott331533ott650221)mrcaott331533ott3600773)mrcaott331533ott582164)mrcaott331533ott842345)Musophagidae)Musophagiformes)mrcaott5021ott198671,((((((((((((((((Columba_livia)mrcaott320359ott938416)mrcaott320359ott921832)mrcaott320359ott767317)mrcaott320359ott493986)mrcaott277817ott320359)mrcaott51607ott277817)Columba)mrcaott51607ott244134)mrcaott51607ott67614)mrcaott51607ott277822)mrcaott45505ott51607)mrcaott45505ott506098)mrcaott45505ott50388)mrcaott17146ott45505)Columbiformes)mrcaott17146ott57819)mrcaott5021ott17146)mrcaott246ott5021)mrcaott246ott5481,((((((((((((((((Meleagris_gallopavo)Meleagris)Meleagridinae)mrcaott4765ott446490,((((((Phasianus_colchicus)Phasianus)mrcaott102722ott137547)mrcaott53700ott102722)mrcaott53700ott309383)mrcaott53700ott466627)mrcaott53700ott572162)mrcaott4765ott53700)mrcaott4765ott51354)mrcaott4765ott415487,(((((((Gallus_gallus)mrcaott153572ott240568)mrcaott153554ott153572)Gallus)mrcaott153554ott867027)mrcaott49310ott153554)mrcaott49310ott51349)mrcaott49310ott102705)mrcaott4765ott49310)mrcaott4765ott49319)mrcaott4765ott54193)mrcaott4765ott151684)mrcaott4765ott104461)mrcaott4765ott75785)mrcaott4765ott109888)mrcaott4765ott6520194)Galliformes,((((((((((((((((((((Anas_platyrhynchos)mrcaott82410ott190881)mrcaott82410ott604182)mrcaott82410ott604175)mrcaott82410ott339494)mrcaott30850ott82410)mrcaott30850ott604172)mrcaott30850ott30855)mrcaott30850ott30858)mrcaott30850ott82414)mrcaott30850ott82420)mrcaott30845ott30850)mrcaott30843ott30845)mrcaott30843ott196654)mrcaott30843ott30847)mrcaott30843ott145504)mrcaott30843ott962771,((((((((Cygnus_atratus)mrcaott140312ott140314)mrcaott140312ott817994)Cygnus)mrcaott75874ott140312)Anserinae)mrcaott75874ott1082830,((((((Oxyura_jamaicensis)mrcaott245432ott249658)mrcaott88380ott245432)mrcaott88380ott432044)Oxyura)mrcaott88380ott892281)mrcaott88380ott317736)mrcaott75874ott88380)mrcaott75874ott432041)mrcaott30843ott75874)Anatidae)mrcaott30843ott714464)Anseriformes)Galloanserae)Neognathae,(((((((((Apteryx_rowi)mrcaott165688ott412972)Apteryx)Apterygidae)Apterygiformes)mrcaott84218ott165688,(((Dromaius_novaehollandiae)Dromaius)Dromaiidae)Casuariiformes)mrcaott84218ott402459,(((((((((((Nothoprocta_perdicaria)mrcaott402446ott3600806)mrcaott402446ott3600803)mrcaott332613ott402446)Nothoprocta)mrcaott292464ott892275)mrcaott224379ott292464)mrcaott224379ott292461)Tinamidae)Tinamiformes)mrcaott167137ott6150815)mrcaott167137ott208456)mrcaott84218ott167137)mrcaott84218ott857860)Palaeognathae)Aves;")
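        // Flatten the parsed tree into a list of nodes by walking each node's
        // branchset recursively (the list is not used by the plot below).
        var newickNodes = []
        function buildNewickNodes(node) {
          newickNodes.push(node)
          if (node.branchset) {
            for (var i=0; i < node.branchset.length; i++) {
              buildNewickNodes(node.branchset[i])
            }
          }
        }
        buildNewickNodes(newick)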

        d3.phylogram.build('#phylogram', newick, {
          width: 300,
          height: 600,
          skipBranchLengthScaling: true
        });
      }
    </script>
    <style type="text/css" media="screen">
      body { font-family: "Helvetica Neue", Helvetica, sans-serif; }
      td { vertical-align: top; }
    </style>
  </head>
  <body onload="load()">
    <table>
      <tr>
        <td>
          <div id='phylogram'></div>
        </td>
      </tr>
    </table>
  </body>
</html>
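
The tree above contains a quoted internal label with parentheses in it, 'Zonotrichia (genus in domain Eukaryota)', which a parser that splits on every parenthesis could misread; whether newick.js copes with it has not been checked here. As a rough illustration of what reading such a tree involves, the sketch below is a minimal stand-alone Newick reader in JavaScript. `parseNewick` and `leafNames` are hypothetical names, the output shape (`name`, `branchset`) simply mirrors the `branchset` field used in the page above, and the reader covers only nested groups, labels, single-quoted labels and branch lengths.

```javascript
// Minimal Newick reader (a sketch, not a full implementation of the format).
// Returns nested objects of the form { name, length, branchset: [...] }.
function parseNewick(text) {
  // Tokens: single-quoted labels, structural characters, or plain label text.
  const tokens = text.match(/'(?:[^']|'')*'|[(),:;]|[^(),:;'\s]+/g) || [];
  const root = {};
  const ancestors = [];
  let node = root;
  let previous = null;

  for (const token of tokens) {
    if (token === "(") {
      // Open a new group: the first child becomes the current node.
      const child = {};
      node.branchset = [child];
      ancestors.push(node);
      node = child;
    } else if (token === ",") {
      // Next sibling within the enclosing group.
      const sibling = {};
      ancestors[ancestors.length - 1].branchset.push(sibling);
      node = sibling;
    } else if (token === ")") {
      // Close the group: a label that follows names the group itself.
      node = ancestors.pop();
    } else if (token === ":" || token === ";") {
      // Separators carry no data of their own.
    } else if (previous === ":") {
      node.length = parseFloat(token);
    } else {
      // Strip surrounding quotes and unescape doubled quotes.
      node.name = token.replace(/^'|'$/g, "").replace(/''/g, "'");
    }
    previous = token;
  }
  return root;
}

// Collect leaf labels in the order they appear in the tree.
function leafNames(node, out = []) {
  if (node.branchset && node.branchset.length > 0) {
    for (const child of node.branchset) {
      leafNames(child, out);
    }
  } else if (node.name !== undefined) {
    out.push(node.name);
  }
  return out;
}

// Small example with a quoted internal label, as in the tree above.
const tree = parseNewick(
  "((Zonotrichia_albicollis)'Zonotrichia (genus in domain Eukaryota)',(Parus_major)Parus)Aves;"
);
console.log(leafNames(tree));
```

The leaf order returned by `leafNames` is the order in which the labels appear in the Newick text, which is the kind of ordering one would want to match when arranging clusters next to a tree.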