richelbilderbeek / razzo

Research project by Giovanni Laudanno and Richel J.C. Bilderbeek
GNU General Public License v3.0

Show number of multiple-birth events #279

Closed richelbilderbeek closed 5 years ago

richelbilderbeek commented 5 years ago

From @rsetienne:

Furthermore, from the parameter values I can't tell how many mbd events will be visible in the tree, so it would be good to present results on this as well. Then you can also look at the error as a function of number of tips in the tree and number of mbd events in the full tree or in the reconstructed tree. This will inform the empiricist whether his/her tree will be likely to deviate from the true tree a lot or not.

richelbilderbeek commented 5 years ago

Wrote a function, collect_n_mb_events, and its test. The function contains two stubs:

#' Collect the number of multiple-birth (MB) events
# ...
collect_n_mb_events <- function(
  project_folder_name
) {
  # ...

  # Issue 279, Issue #279
  # STUB

  # ...

  # Issue 279, Issue #279
  # STUB

  # ...
}

richelbilderbeek commented 5 years ago

@Giappo: I assign you this Issue, as I think you would enjoy it. No worries about the time: I will re-assign it to myself when I really need it :+1:

Giappo commented 5 years ago

We need to define how we want to measure the strength of an mbd event.

Example: We have two sets of branching times:

brts_a <- c(10, 9, 8, 8, 8, 7, 7, 7, 7, 7)
brts_b <- c(10, 9, 8, 8, 7, 7)

We can define three different metrics to measure the mbd-ness:

1) We count the number of multiple events:
   1a) count_n_mb_events_1(brts_a) is 2
   1b) count_n_mb_events_1(brts_b) is 2
   Pros: It's simple.
   Cons: Both cases give the same value, even though the mbd-ness is clearly stronger in the first case.

2) We can consider the species produced by the multiple events:
   2a) count_n_mb_events_2(brts_a) is 3 + 5 = 8
   2b) count_n_mb_events_2(brts_b) is 2 + 2 = 4
   Pros: This actually takes into account the different extent of the events.
   Cons: You do not count mbd events that produce only one species (I HATE them!).

3) We can consider the species produced by the multiple events after the first ones:
   3a) count_n_mb_events_3(brts_a) is 2 + 4 = 6
   3b) count_n_mb_events_3(brts_b) is 1 + 1 = 2
   Pros: This actually takes into account the different extent of the events, as in 2. Furthermore, single events (which are indistinguishable from lambda events) are not taken into account.
   Cons: It's a bit complicated.
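To make the three definitions concrete, here is a minimal sketch in R. It is not the razzo implementation: the bodies of count_n_mb_events_1, count_n_mb_events_2 and count_n_mb_events_3 are only one possible reading of the definitions above, and the helper get_brts_multiplicities and its digits argument are hypothetical names introduced for illustration. It assumes that branching times belonging to the same event are equal after rounding.

```r
# Hypothetical helper (not part of razzo): for each distinct branching time,
# how often does it occur? 'digits' is an assumed rounding precision, so that
# times that are equal up to numerical noise are grouped together.
get_brts_multiplicities <- function(brts, digits = 8) {
  as.vector(table(round(brts, digits = digits)))
}

# Metric 1: the number of branching times that occur more than once
count_n_mb_events_1 <- function(brts) {
  m <- get_brts_multiplicities(brts)
  sum(m > 1)
}

# Metric 2: the total number of branching times that are part of
# a repeated group
count_n_mb_events_2 <- function(brts) {
  m <- get_brts_multiplicities(brts)
  sum(m[m > 1])
}

# Metric 3: as metric 2, but the first branching time of each
# repeated group is not counted
count_n_mb_events_3 <- function(brts) {
  m <- get_brts_multiplicities(brts)
  sum(m[m > 1] - 1)
}

brts_a <- c(10, 9, 8, 8, 8, 7, 7, 7, 7, 7)
brts_b <- c(10, 9, 8, 8, 7, 7)

print(c(count_n_mb_events_1(brts_a), count_n_mb_events_1(brts_b)))
print(c(count_n_mb_events_2(brts_a), count_n_mb_events_2(brts_b)))
print(c(count_n_mb_events_3(brts_a), count_n_mb_events_3(brts_b)))
```

For brts_a and brts_b this reproduces the values listed above: 2 and 2, 8 and 4, 6 and 2. On branching times taken from a real tree, the rounding precision would need to be chosen with care, as exact equality of floating-point times cannot be assumed.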

I might have overlooked possible pros and cons. Lemme know your opinions on those so we can choose the best criterion. 🌈

richelbilderbeek commented 5 years ago

Option 3 is definitely my favorite one! I predict @rsetienne would also prefer that option :rainbow:

richelbilderbeek commented 5 years ago

Looks good, closing this, as it will be continued with #289.