tskit-dev / what-is-an-arg-paper

Manuscript and code for the "What is an ARG?" paper

Data availability statement is not in correct form #434

Closed (jeromekelleher closed 4 months ago)

jeromekelleher commented 9 months ago

The data availability section should be a statement in exactly the same words as used for journal submissions. It should not contain any cross-references to the document, and just say the minimum required.

Check the required form for the journal and make sure we get everything needed in there.

hyanwong commented 9 months ago

Great point. I'll check on this.

hyanwong commented 9 months ago

Here are the examples:

https://academic.oup.com/genetics/pages/data-policy#Data%20Availability

hyanwong commented 9 months ago

Minimum would be:

Code used to generate the simulated data can be found at https://github.com/tskit-dev/what-is-an-arg-paper/releases/tag/1.0

However, that isn't very helpful to the reader, as it's not clear which simulated data we are talking about (e.g. are Figs 1 and 2 "simulated data"?). It also doesn't mention the code used for plotting, which in this case is, I think, important. So (if we are trying not to mention which figures we are talking about) we could say e.g.

Code used to generate figures can be found at XXX. Code used to perform Wright-Fisher simulations can be found at YYY. Code and details of software used to perform ARG inference can be found at ZZZ.

We should provide links to GH with the /releases/tag/1.0 suffix, I think. I can reorganise the code in the repo so that XXX, YYY, and ZZZ are obvious links?

jeromekelleher commented 9 months ago

I think docs for which bit of code goes with which plot belong here in the repo. In the paper, we just state that all the required code is here.

This is a statement, not a roadmap or documentation.

hyanwong commented 9 months ago

You mean we say "Code used to generate figures, including Wright-Fisher simulations and ARG inferences, can be found at XXX"? I guess that would work.

I reckon it's worth putting the WF bit in because (a) we talk about the simulation setup and (b) we don't have code to "simulate" figs 1, 2, 3, A1, and A2.

If we just put "generate the figures" it's not clear that this involves the simulator, right? If I were a pedantic reader, that wouldn't be enough for me to be convinced that we were actually making the simulation code available, without actually visiting the repo: it could be a closed-source simulator called by the figure-generating code.

jeromekelleher commented 4 months ago

I've cut this down to a minimal statement in #460.

If you'd like to give a roadmap to where stuff can be found, I think the README.md file in the repo would be a great place to put this. The current README can be entirely replaced.

hyanwong commented 4 months ago

OK. I'll do the README.