petrelharp / ftprime_ms

4 stars 2 forks source link

responses #82

Closed petrelharp closed 6 years ago

petrelharp commented 6 years ago

Here's a go at responding. Anything is fair game to edit, but there's just a few TODOs, including one question for @molpopgen. Please have a look, @jeromekelleher @ashander.

molpopgen commented 6 years ago

For N=5e4, four threads were used. The speedup is not that impressive for a variety of reasons, though, which are hard to tease apart w/o a lot more work. False sharing is one possible issue, but I also thing that 5e4 diploids may actually be too few to make parallelization "worth" it when you're spawning new threads each generation. Creating/destroying threads is surprisingly costly, meaning that a "thread pool" model would likely be much more efficient.

petrelharp commented 6 years ago

Question: should we switch lots of the msprimes over to tskits? For instance:

The msprime Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.

would become

The tskit Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.

What do you think? @jeromekelleher @molpopgen @ashander ?

petrelharp commented 6 years ago

Other than that question, we may be ready to re-submit. Let me know.

molpopgen commented 6 years ago

I've been debating this myself. To do that, I think we need to either:

  1. Have a tskit. this would be ideal, but probably not realistic, because it gives Jerome an epic homework assignment that he probably doesn't need right now.

  2. Be very careful about what we mean. We'd need to have a caveat paragraph that says that what we're calling A is really part of B right now, but that the plan is to give birth to A as soon as humanly possible.

On Wed, Jun 6, 2018 at 10:43 AM Peter Ralph notifications@github.com wrote:

Question: should we switch lots of the msprimes over to tskits? For instance:

The msprime Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.

would become

The tskit Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.

What do you think? @jeromekelleher https://github.com/jeromekelleher @molpopgen https://github.com/molpopgen @ashander https://github.com/ashander ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/petrelharp/ftprime_ms/pull/82#issuecomment-395152488, or mute the thread https://github.com/notifications/unsubscribe-auth/AGHnHwqTyKUtjz4lA8m6QbK3zgxFLdtzks5t6BNHgaJpZM4UX-oW .

--

Kevin Thornton

Associate Professor

Ecology and Evolutionary Biology

UC Irvine

http://www.molpopgen.org

http://github.com/ThorntonLab

http://github.com/molpopgen

jeromekelleher commented 6 years ago

I'm starting to think we should change them to tskits. If we don't, the paper will be out of date a few months after it is published, which will surely lead to more confusion in the long-term.

The homework assignment isn't an option I'm afraid. What we could do without too much work is put together a holding page for tskit explaining that stuff is still in msprime and we're currently working on splitting them out. If we're up front with this in our cover letter then surely anyone but the most pedantic of reviewers would be OK with it. We would need to be clear that this is not the "tskit paper" though.

Another option would be to make a very shallow tskit Python package that just imports from msprime and copy over a subset of msprime's documentation. It would satisfy the letter of "tskit existing".

molpopgen commented 6 years ago

This is fine with me. I don't think a standalone tskit is required here. The idea of a Python skeleton package has some appeal, but again probably isn't required. We can, in jest, suggest the following:

import msprime as tskit.

But, the current reality is that msprime does the job, and anyone wanting the C API directly can/should be able to figure out what .c/.h files to copy into their own projects and/or use msprime as a git submodule as a stop-gap solution, referring to the relevant source files in their project's Makefile or whatnot.

On Wed, Jun 6, 2018 at 12:26 PM Jerome Kelleher notifications@github.com wrote:

I'm starting to think we should change them to tskits. If we don't, the paper will be out of date a few months after it is published, which will surely lead to more confusion in the long-term.

The homework assignment isn't an option I'm afraid. What we could do without too much work is put together a holding page for tskit explaining that stuff is still in msprime and we're currently working on splitting them out. If we're up front with this in our cover letter then surely anyone but the most pedantic of reviewers would be OK with it. We would need to be clear that this is not the "tskit paper" though.

Another option would be to make a very shallow tskit Python package that just imports from msprime and copy over a subset of msprime's documentation. It would satisfy the letter of "tskit existing".

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/petrelharp/ftprime_ms/pull/82#issuecomment-395184527, or mute the thread https://github.com/notifications/unsubscribe-auth/AGHnH5bHQ-zaBym_DM5IGQ2FhIEzt31nks5t6CwjgaJpZM4UX-oW .

--

Kevin Thornton

Associate Professor

Ecology and Evolutionary Biology

UC Irvine

http://www.molpopgen.org

http://github.com/ThorntonLab

http://github.com/molpopgen

jeromekelleher commented 6 years ago

This is fine with me. I don't think a standalone tskit is required here. The idea of a Python skeleton package has some appeal, but again probably isn't required. We can, in jest, suggest the following:

import msprime as tskit.

Now that is a truly excellent idea!

petrelharp commented 6 years ago

OK, I'll make a pass through changing the appropriate things to tskit.

petrelharp commented 6 years ago

I've switched things over to tskit in the right place, and propose this for the tskit README:

Currently, the `tskit` tools are still bundled with the Python package [`msprime`](https://github.com/tskit-dev/msprime). We expect to separate the two soon, but to use `tskit` in the meantime, [install `msprime`](https://msprime.readthedocs.io/en/latest/installation.html), and then
 `` `
 import msprime as tskit
 `` `
will get you the functionality that `tskit` will provide in the future.
petrelharp commented 6 years ago

updated re: @ashander's comments.

jeromekelleher commented 6 years ago

Also spotted: "which grows with $\log(n)$, as shown in Figure~\ref{fig:simplify_complexity}C,"

(Trailing comma)

jeromekelleher commented 6 years ago

OK, I've made a full pass through and it's looking great! Thanks for all the hard work @petrelharp, submit at will!