Closed petrelharp closed 6 years ago
For N=5e4, four threads were used. The speedup is not that impressive for a variety of reasons, though, which are hard to tease apart w/o a lot more work. False sharing is one possible issue, but I also thing that 5e4 diploids may actually be too few to make parallelization "worth" it when you're spawning new threads each generation. Creating/destroying threads is surprisingly costly, meaning that a "thread pool" model would likely be much more efficient.
Question: should we switch lots of the msprime
s over to tskit
s? For instance:
The msprime Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.
would become
The tskit Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.
What do you think? @jeromekelleher @molpopgen @ashander ?
Other than that question, we may be ready to re-submit. Let me know.
I've been debating this myself. To do that, I think we need to either:
Have a tskit. this would be ideal, but probably not realistic, because it gives Jerome an epic homework assignment that he probably doesn't need right now.
Be very careful about what we mean. We'd need to have a caveat paragraph that says that what we're calling A is really part of B right now, but that the plan is to give birth to A as soon as humanly possible.
On Wed, Jun 6, 2018 at 10:43 AM Peter Ralph notifications@github.com wrote:
Question: should we switch lots of the msprimes over to tskits? For instance:
The msprime Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.
would become
The tskit Python Tables API provides a general purpose toolkit for importing and processing succinct tree sequences.
What do you think? @jeromekelleher https://github.com/jeromekelleher @molpopgen https://github.com/molpopgen @ashander https://github.com/ashander ?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/petrelharp/ftprime_ms/pull/82#issuecomment-395152488, or mute the thread https://github.com/notifications/unsubscribe-auth/AGHnHwqTyKUtjz4lA8m6QbK3zgxFLdtzks5t6BNHgaJpZM4UX-oW .
--
Kevin Thornton
Associate Professor
Ecology and Evolutionary Biology
UC Irvine
I'm starting to think we should change them to tskit
s. If we don't, the paper will be out of date a few months after it is published, which will surely lead to more confusion in the long-term.
The homework assignment isn't an option I'm afraid. What we could do without too much work is put together a holding page for tskit explaining that stuff is still in msprime and we're currently working on splitting them out. If we're up front with this in our cover letter then surely anyone but the most pedantic of reviewers would be OK with it. We would need to be clear that this is not the "tskit paper" though.
Another option would be to make a very shallow tskit
Python package that just imports from msprime and copy over a subset of msprime's documentation. It would satisfy the letter of "tskit existing".
This is fine with me. I don't think a standalone tskit is required here. The idea of a Python skeleton package has some appeal, but again probably isn't required. We can, in jest, suggest the following:
import msprime as tskit.
But, the current reality is that msprime does the job, and anyone wanting the C API directly can/should be able to figure out what .c/.h files to copy into their own projects and/or use msprime as a git submodule as a stop-gap solution, referring to the relevant source files in their project's Makefile or whatnot.
On Wed, Jun 6, 2018 at 12:26 PM Jerome Kelleher notifications@github.com wrote:
I'm starting to think we should change them to tskits. If we don't, the paper will be out of date a few months after it is published, which will surely lead to more confusion in the long-term.
The homework assignment isn't an option I'm afraid. What we could do without too much work is put together a holding page for tskit explaining that stuff is still in msprime and we're currently working on splitting them out. If we're up front with this in our cover letter then surely anyone but the most pedantic of reviewers would be OK with it. We would need to be clear that this is not the "tskit paper" though.
Another option would be to make a very shallow tskit Python package that just imports from msprime and copy over a subset of msprime's documentation. It would satisfy the letter of "tskit existing".
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/petrelharp/ftprime_ms/pull/82#issuecomment-395184527, or mute the thread https://github.com/notifications/unsubscribe-auth/AGHnH5bHQ-zaBym_DM5IGQ2FhIEzt31nks5t6CwjgaJpZM4UX-oW .
--
Kevin Thornton
Associate Professor
Ecology and Evolutionary Biology
UC Irvine
This is fine with me. I don't think a standalone tskit is required here. The idea of a Python skeleton package has some appeal, but again probably isn't required. We can, in jest, suggest the following:
import msprime as tskit.
Now that is a truly excellent idea!
OK, I'll make a pass through changing the appropriate things to tskit
.
I've switched things over to tskit
in the right place, and propose this for the tskit
README:
Currently, the `tskit` tools are still bundled with the Python package [`msprime`](https://github.com/tskit-dev/msprime). We expect to separate the two soon, but to use `tskit` in the meantime, [install `msprime`](https://msprime.readthedocs.io/en/latest/installation.html), and then
`` `
import msprime as tskit
`` `
will get you the functionality that `tskit` will provide in the future.
updated re: @ashander's comments.
Also spotted: "which grows with $\log(n)$, as shown in Figure~\ref{fig:simplify_complexity}C,"
(Trailing comma)
OK, I've made a full pass through and it's looking great! Thanks for all the hard work @petrelharp, submit at will!
Here's a go at responding. Anything is fair game to edit, but there's just a few TODOs, including one question for @molpopgen. Please have a look, @jeromekelleher @ashander.