lutteropp opened this issue 3 years ago
sounds good Sarah, thank you,
Alexis
On 17.02.21 13:57, Sarah Lutteropp wrote:
For all experiments, we will have (unless mentioned otherwise):
- number of MSA sites = 1000 * (number of displayed trees)
- perfect sampling: sites are sampled proportionally to the displayed tree probability (see the sketch after this list)
- each displayed tree gets its own partition in the MSA
- all reticulations with probability 0.5
- we will not simulate any "weird" networks (that is, networks with unrecoverable reticulations)
- two setups: starting from raxml-ng best tree vs. starting from 5 parsimony trees + 5 random trees
- two likelihood models: LikelihoodModel.BEST and LikelihoodModel.AVERAGE
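To make the sampling setup concrete, here is a minimal sketch of what "perfect sampling" means under these settings. This is not the actual simulation script; the function name and the `perfect` flag are made up for illustration:

```python
import random

def sites_per_displayed_tree(tree_probs, sites_per_tree=1000, perfect=True):
    """Assign MSA sites to displayed trees.

    tree_probs: probabilities of the displayed trees (summing to 1).
    The total MSA length is 1000 * (number of displayed trees); with perfect
    sampling, each displayed tree receives a number of sites proportional to
    its probability, and its sites form one partition.
    """
    total_sites = sites_per_tree * len(tree_probs)
    if perfect:
        # deterministic proportional split; the last partition absorbs rounding
        counts = [round(p * total_sites) for p in tree_probs[:-1]]
        counts.append(total_sites - sum(counts))
    else:
        # alternative: draw each site's displayed tree at random
        counts = [0] * len(tree_probs)
        for _ in range(total_sites):
            counts[random.choices(range(len(tree_probs)), weights=tree_probs)[0]] += 1
    return counts

# Example: 1 reticulation with probability 0.3 -> two displayed trees with probs 0.3 / 0.7
print(sites_per_displayed_tree([0.3, 0.7]))  # [600, 1400]
```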
I am scripting and then submitting the following experiments now:
- 10 taxa, 1 reticulation with probability in {0.1, 0.2, 0.3, 0.4, 0.5} (fixed topology per dataset, just changing the reticulation prob).
- 10 taxa, 1 reticulation, brlen_scaler in {1,2,4,8} (fixed topology per dataset, just scaling the branches).
- 10 taxa, number of reticulations in {1,2,3}
- 10 taxa, 1 reticulation, unpartitioned dataset, LikelihoodModel.AVERAGE
I chose 10 taxa because that is neither super few taxa nor extremely many.
Bad news: The only experiment that finished within 24 hours was "10 taxa, 1 reticulation, unpartitioned dataset, LikelihoodModel.AVERAGE". And in this one, it looks like NetRAX inferred a tree all the time...
For "10 taxa, 1 reticulation with probability in {0.1, 0.2, 0.3, 0.4, 0.5}" and "10 taxa, 1 reticulation, brlen_scaler in {1,2,4,8}", NetRAX was too slow and the job got killed after running 24 hours.
For "10 taxa, number of reticulations in {1,2,3}", only simulated networks for 1 and 2 reticulations were generated. Looks like there is no way to have 3 reticulations in a 10 taxon dataset without having a weird network. Thus, the entire 24 hours were spent in endlessly re-trying simulating a 3-reticulation network...
Proposed quick-fix solution:
I am resubmitting all the experiments from this issue with 15 taxa now. We can now use all threads on a cluster node in NetRAX (with site repeats disabled, thanks to Pierre who parallelized pll_update_partials with OpenMP), which gives a theoretical best-case speedup of 16x :) (OK, 6x or so is more likely... still, it should suffice for this set of experiments.)
That's great, but why don't you just use our lab-owned servers, which do not have any runtime limit, for the experiments?
Switching to another server that has no time limit doesn't help with:
- 3 reticulations on a 10-taxon non-weird network being impossible
- some potential bugs or endless loops (e.g., in the experiment scripts)
I want to see at least some result files already being there, to be fully convinced that it is just a time-limit issue. Also, with these empirical datasets being far more demanding in size than we first thought, we need a faster NetRAX anyway.
okay
I figured out a problem: the new experiment script had a bug in the change-reticulation-probs step, where it overwrote the network file with a tree.
That's why I kept inferring trees on all datasets... I've fixed the issue, aborted the experiments, deleted their generated files, and resubmitted them.
(Copy & paste from a Slack message) I redid the simulation experiments and changed the experimental settings; the old ones appeared useless to me. The new experimental setting is:
- Always start from the best raxml-ng tree. (Reason: doing multiple runs from various random/parsimony trees was too slow for larger datasets, and we already get nice results when just starting from the best raxml-ng tree.)
- Always use linked brlens.
- Evaluate two likelihood models: BEST and AVERAGE. (Spoiler: not really a difference in the results, but that is likely due to how we simulated the data, with each partition following a single displayed tree.)
- n_taxa = {10, 15, 20, 25, 30, 35, 40}
- n_reticulations = {1, 2, 3, 4}
- as many partitions as displayed trees, 1000 simulated MSA sites per partition
For each combination in n_taxa x n_reticulations, do the following experiments:
- inference on the unpartitioned dataset (spoiler: this always led to 0 inferred reticulations)
- change the brlen scaler in {1, 2, 3, 4}
- change the reticulation prob in {0.1, 0.2, 0.3, 0.4, 0.5}
- standard (brlen_scaler = 1, reticulation prob = 0.5)
All experiments already finished on haswell. I am now evaluating the results and writing this up in the paper. It's damn nice that NetRAX can already infer networks on simulated data with 40 taxa and 4 reticulations. 😎 (In reasonable time, and with a normal MSA, not a super-big killer MSA like the one in the empirical dataset.)
Regarding the evaluation of LikelihoodModel.BEST vs. LikelihoodModel.AVERAGE: if we want to do this properly, we need to scramble the partitions and somewhat randomly assign sites from different displayed trees to a partition (see the sketch below). The only insight on the two likelihood models we have so far is that both fail if the MSA is unpartitioned, and both perform equally well if the MSA is perfectly partitioned.
But is this realistic in biological datasets, having partitions that contain sites from different evolutionary histories?
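To make the scrambling idea concrete, a minimal sketch (a hypothetical helper, not NetRAX code) of assigning simulated sites to partitions so that each partition mixes columns from different displayed trees:

```python
import random

def scramble_partitions(site_tree_ids, n_partitions, seed=42):
    """Randomly reassign simulated MSA columns to partitions.

    site_tree_ids: for each column, the index of the displayed tree it was
    simulated under. Instead of one partition per displayed tree, every
    partition ends up containing columns from several evolutionary histories.
    Returns a list of partitions, each a sorted list of column indices.
    """
    rng = random.Random(seed)
    columns = list(range(len(site_tree_ids)))
    rng.shuffle(columns)
    # split the shuffled columns into n_partitions chunks of (almost) equal size
    return [sorted(columns[i::n_partitions]) for i in range(n_partitions)]

# toy example: 6 columns from displayed tree 0 and 6 from tree 1, scrambled into 3 partitions
for i, cols in enumerate(scramble_partitions([0] * 6 + [1] * 6, n_partitions=3)):
    print(f"partition {i}: columns {cols}")
```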
The problem with starting the network search from multiple random/parsimony trees is:
- For a low number of taxa, it is unlikely that the raxml-ng best tree is a bad starting choice.
- For a low number of reticulations, it is unlikely that starting from the raxml-ng best tree is a problem.
- For a high number of taxa combined with a high number of reticulations, we are still way too slow to do multiple searches.
I still believe that several random and parsimony starting trees should absolutely be included, at least for the smaller and/or simpler datasets if it is not feasible for the large ones, to have a good reference and a justification for why we think the RAxML-NG trees are better.
If we only start with RAxML-NG trees, a reviewer will surely ask for comparative tests with random starting trees.
Alexis
On 21.06.21 12:27, Sarah Lutteropp wrote:
But is this realistic in biological datasets, having partitions that contain sites from different evolutionary histories?
I wouldn't worry too much about the realism; I think that some scrambling should be done anyhow to properly assess the model. One could maybe think of various ways in which this could occur (I don't know for sure, just speculating here), for instance, only certain introns or exons that are shorter than a gene/partition being transferred during reticulations.
Alexis
- Always use linked brlens.
Again, at least for the smaller/easier datasets we should have and include data for the other brlen options, to have a justification for why we focus on linked brlens in the larger experiments.
- change brlen scaler {1, 2, 3, 4}
I don't get what brlen scaler means here and how you can just set it to those values.
- change reticulation prob {0.1, 0.2, 0.3, 0.4, 0.5}
- standard (brlen_scaler=1, reticulation prob = 0.5)
Okay.
All experiments already finished on haswell. I am now evaluating the results and writing this up in the paper. It's damn nice that NetRAX can already infer networks on simulated data with 40 taxa and 4 reticulations. 😎 (in reasonable time, but with a normal MSA, not a super-big killer MSA like the one in the empirical dataset)
This is excellent :-)
Alexis
I don't get what brlen scaler means here and how you can just set it to those values.
It was some old experiment where we scaled the branches of the simulated network before simulating the sequences. A higher brlen scaler value leads to more mutations.
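Just to illustrate what that means (a sketch of the idea, not the actual simulation pipeline): before simulating sequences, every branch length in the true network is multiplied by the scaler, so a scaler of 4 means four times longer branches and hence more expected substitutions per site.

```python
import re

def scale_newick_branch_lengths(newick: str, scaler: float) -> str:
    """Multiply every branch length in a Newick string by `scaler`."""
    return re.sub(r":(\d+\.?\d*(?:[eE][+-]?\d+)?)",
                  lambda m: ":" + str(float(m.group(1)) * scaler),
                  newick)

print(scale_newick_branch_lengths("((A:0.1,B:0.2):0.05,C:0.3);", 4))
# ((A:0.4,B:0.8):0.2,C:1.2);
```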
at least for the smaller/easier datasets we should have and include data for the other br-len options to have a justification why we focus on linked br-lens in the larger experiments.
But we simulate the sequences using linked branch lengths.
If we only start with RAxML-NG trees a reviewer will surely ask for comparative tests with random starting trees.
Good point. I'm adding some results using multiple random+parsimony starting trees on small datasets as a justification.
The inference started from the raxml-ng best tree was not always the one yielding the best network in the end, but the differences were minimal, and the underlying problem is that we get stuck in local optima... Also, starting additional searches from other promising, previously non-taken BIC-improving states (not fully implemented though, hence it landed in the Future Work section) sometimes leads to a slightly better inferred network, too.
The runtime-quality tradeoff is the main argument here.
On 21.06.21 15:00, Sarah Lutteropp wrote:
I don't get what brlen scaler means here and how you can just set it to those values.
It was some old experiment where we scaled the branches of the simulated network before simulating the sequences.
Okay, now it makes sense, I was missing the information that this refers to the simulation rather than the inference setting.
at least for the smaller/easier datasets we should have and include data for the other br-len options to have a justification why we focus on linked br-lens in the larger experiments.
But we simulated the sequences using linked branch lengths.
Still, it would be nice to quantify the impact: in reality we don't know the branch-length model, so knowing what the impact of using the wrong one is, is interesting.
If we only start with RAxML-NG trees a reviewer will surely ask for comparative tests with random starting trees.
Good point. I'm adding some results using multiple random+parsimony starting trees on small datasets as a justification.
Sounds good.
Alexis
sure I get the runtime argument ...
in reality we don't know the branch model, hence knowing what the impact of using the wrong one is, is interesting.
Ok, I will also run some inferences with the other branch models. Let's hope the code works on them. I haven't tested them in months...
Regarding unlinked branch lengths: we already had some past discussions with @celinescornavacca about them being incapable of recovering some reticulations. See https://github.com/lutteropp/NetRAX/issues/33
Also, @stamatak said here that using only linked brlens should be fine: https://github.com/lutteropp/NetRAX/issues/33#issuecomment-748508711 ... but that was without the "how do wrong branch models perform" argument.
okay, if the code doesn't work any more, we just skip these experiments
OK, I submitted the random-starting-trees and the different-brlen-linkage experiments to haswell.
The scrambled-partitions experiment is trickier; I first need to code it.
Do we also need experiments for each of the highly experimental alternative search options I implemented? Scrambling the inferred network and restarting a search from it, an extreme greedy move-acceptance mode with various subvariants, disabling the elbow method, different move radii, an endless search mode, different move-type choices and orderings, ...
I tend to say no. Actually, I am even considering removing them from NetRAX entirely, because at some point they were inferior to what is now the NetRAX default, and some of them also make the inference run (sometimes much) slower. By removing them entirely, we can make NetRAX at least a bit simpler and easier to understand. The whole thing is already complex enough in its default mode...
There is a problem with scaled branch lengths:

```
terminate called after throwing an instance of 'std::runtime_error'
  what():  I believe this function currently does not work correctly with scaled branch lengths
```
The problem is in computing the partition likelihood derivatives. I was super confused by this `s` value in libpll/pll-modules back then and did not understand when to multiply the branch length by it and when not, and also when to use it in the derivative computation and when not.
So, let's skip the scaled branch lengths mode for now, unless someone wants to explain the libpll/pll-modules internals to me so that we can be sure I handle that `s` value correctly (and don't apply it one time too often or one time too few)...
```cpp
double s = 1.0;
// double s = ann_network.fake_treeinfo->brlen_scalers ?
//                ann_network.fake_treeinfo->brlen_scalers[partition_idx] : 1.;
double p_brlen =
    s * ann_network.fake_treeinfo->branch_lengths[partition_idx][pmatrix_index];
// [...]
// res.lh_prime *= s;
// res.lh_prime_prime *= s * s;
```
I believe my problem with scaled branch lengths was: if I already multiply the branch length by the scaler, do I also need to multiply the likelihood derivatives by it, or not?
In case someone is interested in it: This is the function where I was unsure about how to correctly handle the scaler: https://github.com/lutteropp/NetRAX/blob/master/src/likelihood/LikelihoodDerivatives.cpp#L30
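For what it's worth, a hedged sketch of the math, assuming the derivative routine returns derivatives with respect to the (scaled) branch length that was actually plugged into the P-matrix: if the per-partition length is $b_s = s \cdot b$ and we optimize the unscaled $b$, the chain rule gives

$$\frac{\partial \ln L}{\partial b} = s \, \frac{\partial \ln L}{\partial b_s}, \qquad
\frac{\partial^2 \ln L}{\partial b^2} = s^2 \, \frac{\partial^2 \ln L}{\partial b_s^2},$$

which is what the commented-out `res.lh_prime *= s; res.lh_prime_prime *= s * s;` lines would do. Whether that extra factor is needed at all depends on whether the routine already differentiates with respect to the unscaled length internally; applying $s$ both to the branch length passed in and to derivatives that are already taken with respect to the unscaled length would be the "one time too often" case.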
Another problem with the experiments: So far, I just simulated a single network for each combination of n_taxa and n_reticulations.
We need yet another experiment (linked brlens, raxml-ng best tree) where we fix the number of taxa and the number of reticulations and simulate multiple networks for it. Let's go big for bragging rights and use 40 taxa, 4 reticulations, 100 times. 😎
I will be posting the merged result CLV files here: https://github.com/lutteropp/NetRAX/issues/81
sounds good