koadman / proxigenomics

Hi-C analysis of heterogeneous samples
GNU General Public License v2.0

Integrate SG evolver #17

cerebis closed 9 years ago

cerebis commented 9 years ago

sgEvolver needs to be integrated to make this a complete pipeline.

koadman commented 9 years ago

We've got a good start with the CAMI integration of sgEvolver. It creates 40 simulated genomes on a fixed tree topology. We should be able to lift it directly from CAMI and, once it's in and working, think about extending it to more complicated evolutionary scenarios.

cerebis commented 9 years ago

This is mostly complete. A few hanging questions before I can consider it finalised.

  1. Is breakSimulatedGenomeOnAncestralContigs necessary? It would be preferable to avoid putting the resulting multi-contig sequences back together as a single file. I would rather just take evolved_seq.fas, with a single contig per taxon if that is the outcome of the evolutionary model.
  2. It seems that the requested length of the generated sequence is overridden in simujobrun.pl. I am specifying 1 Mbp but getting near the full length of the input.

koadman commented 9 years ago

No need for breakSimulatedGenomeOnAncestralContigs; that's a CAMI special.
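
If evolved_seq.fas is used directly, a quick sanity check on the one-contig-per-taxon expectation might look like the sketch below. It assumes the first whitespace-delimited token of each FASTA header is the taxon name, which is a guess about the file layout rather than something confirmed in this thread; the function names are hypothetical.

```python
from collections import Counter


def count_records_per_taxon(fasta_text):
    """Count FASTA records per taxon.

    Assumption: the first token of each header line names the taxon.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            counts[name] += 1
    return counts


def multi_contig_taxa(fasta_text):
    """Return the sorted names of taxa that appear in more than one record."""
    counts = count_records_per_taxon(fasta_text)
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty result from `multi_contig_taxa` would suggest that every taxon has a single contig, so no splitting or re-joining step is needed.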

Yes, the simujobrun.pl from CAMI sets ancestral sequence length to the full sequence length. This is unnecessary and can be removed. It was done in CAMI to simplify usage across a wide variety of inputs, but doing so changes the nature of the donor gene pool for gene gains.
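
As a rough illustration of the behaviour being asked for, the sketch below honours a requested ancestral length and falls back to the full input length only when none is given. The function name and parameters are hypothetical, this is not the logic in simujobrun.pl, and capping the request at the input length is an assumption of the sketch.

```python
def ancestral_length(requested_length, input_length):
    """Pick the ancestral sequence length for a simulation run.

    Hypothetical helper: uses the requested length when one is supplied,
    capped at the input length (an assumption), and otherwise falls back
    to the full input length.
    """
    if input_length <= 0:
        raise ValueError("input_length must be positive")
    if requested_length is None:
        return input_length
    if requested_length <= 0:
        raise ValueError("requested_length must be positive")
    return min(requested_length, input_length)
```

With a 1 Mbp request against a longer input, this returns 1 Mbp instead of the full input length.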