Closed whedon closed 3 years ago
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @djmitche, @rabdill it looks like you're currently assigned to review this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Reference check summary:
OK DOIs
- 10.1111/j.1471-8286.2006.01256.x is OK
- 10.1016/j.ajhg.2016.05.020 is OK
- 10.2307/2409206 is OK
- 10.1093/bioinformatics/bts460 is OK
MISSING DOIs
- None
INVALID DOIs
- None
Dear @djmitche and @rabdill, thank you very much for accepting review this submission for JOSS.
Please check the instructions and checklists above (I see that @djmitche already started), and let me know if you need any assistance. You can also tag @rhagenson if you need to ask specific questions about the submission.
I'm having a bit of trouble figuring out what this does, as someone not familiar with this kind of analysis. The README is very clear about input formats, but for example doesn't explain what "relatedness" means. Is there another resource that could help me understand this sort of thing?
@djmitche Great question and a bit of oversight on my part assuming relatedness is more widely known. In short, relatedness is the measure of how much two individuals overlap genetically -- ranging from 0 (no relation) to 1 (clones). relped
began its lifecycle as a quasi-extension to a tool called ML-Relate, which is where the codes under Relatedness originated from.
Perhaps the best reference for understanding relatedness is "Estimating relatedness using genetic markers" by Queller and Goodnight.
Does that help? Do you believe I need a brief section explaining relatedness in the README?
@djmitche, I think that would be nice to explain that on the manuscript's summary.
It's actually on JOSS guidelines, the summary should "have a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided".
So a quick and simple explanation on relatedness should be there - and perhaps a slightly more detailed explanation could fit in the README.
Thanks for clarifying, @rhagenson. I agree with @marcosvital about adding some of this to the project—my only note so far was that there was very little background for how this could be applied. The software documentation itself is helpful, and I'm sure someone already doing this work would see how it fits into their workflow, but the description assumes a lot of prior knowledge.
@rabdill, @djmitche, and @marcosvital I will add a briefing on relatedness to the paper and README to frame relped
.
@rabdill, specifically, when you say "there was very little background for how this could be applied" do you mean you had difficulties identifying the usefulness of relped
, or where it fits in a hypothetical project?
If it would be helpful, I can put a few sentences in the paper about where the methodology was required. We developed relped
to build the pedigree network of hundreds of captive animals to inform national breeding programs. However, this is not the only use of relped
so I did not force this context into either the paper or documentation -- where it might distract from the form and function of the software itself.
I have shared relped
(post-entering review) with a few agriculture former colleagues, conservationists, and zoo/aquarium staff as I can see where it would be useful in those areas. Already have a couple Thank You's for building it so I am confident that when @rabdill said "someone already doing this work would see how it fits into their workflow" that is very much true.
@whedon generate pdf
@rabdill, @djmitche, and @marcosvital, the update to add a brief introduction to relatedness in both manuscript and documentation has been completed.
Dear authors and reviewers
We wanted to notify you that in light of the current COVID-19 pandemic, JOSS has decided to suspend submission of new manuscripts and to handle existing manuscripts (such as this one) on a "best efforts basis". We understand that you may need to attend to more pressing issues than completing a review or updating a repository in response to a review. If this is the case, a quick note indicating that you need to put a "pause" on your involvement with a review would be appreciated but is not required.
Thanks in advance for your understanding.
Arfon Smith, Editor in Chief, on behalf of the JOSS editorial team.
As far as I know, I am as available as usual so no immediate need to pause. However, have no expectation of the same reality among this submission's editor and reviewers.
Please be safe and protect yourselves; this submission can wait for everyone to be happy and healthy during these times. =)
On Fri, Mar 13, 2020, 7:22 PM Arfon Smith notifications@github.com wrote:
Dear authors and reviewers
We wanted to notify you that in light of the current COVID-19 pandemic, JOSS has decided to suspend submission of new manuscripts and to handle existing manuscripts (such as this one) on a "best efforts basis". We understand that you may need to attend to more pressing issues than completing a review or updating a repository in response to a review. If this is the case, a quick note indicating that you need to put a "pause" on your involvement with a review would be appreciated but is not required.
Thanks in advance for your understanding.
Arfon Smith, Editor in Chief, on behalf of the JOSS editorial team.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openjournals/joss-reviews/issues/2124#issuecomment-598981346, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJJ6PZWI7D6QGZ5GOYT46LRHLE4NANCNFSM4K5FODGA .
Hi! Working my way through the rest of the review now, thank you for your patience. I have a few minor notes, but I ran into something that might be a bigger issue, so I'll just post that until I can make more progress.
I'm not clear on whether this is a problem or not, but I noticed that the inferred connections are not consistent between runs. I realize the layout of the diagrams may change, but the readme and manuscript state that all connections will be the same between runs. Not all of the differences I found could be fixed by just adding the "undirected" option, so I wanted to bring this up just in case.
I generated four diagrams using identical inputs with
relped build --relatedness="example-data/relatedness-nums-and-codes.csv" \
--parentage="example-data/parentage.csv" --output "asdf.dot"
It generated output exactly as described, and I included the figures below with red circles around a few discrepancies I noticed:
I thought maybe this would go away with more data, so I tried running it again including the demographics file, but I found several inconsistent connections between the first 2 iterations there as well. @rhagenson, would you mind clarifying if this is expected behavior? Please let me know if it'd be helpful to have this in a different format too; I'd be happy to send it some other way. Thanks.
@rabdill, I am investigating this right now one piece at a time (so expect follow-up messages).
On M2->F3 not appearing consistently: it appears what is happening is that when M2->F3 does not appear, F7->F3 appears. What I suspect is happening is that the quasi-random order of which edges are pruned is "deciding" whether M2->F3 or F7->F3 is "chosen" for the final plot. In the input that you are using, both M2 and F7 are listed as half-siblings (HS) of F3. Given it is expected behavior (in my opinion) that ALL codified entries appear in the output, I would say that this is a bug and I need to investigate when/where codified entries are being removed.
Loops, such as in iteration 2 should be removed already. That one is going to take some head work to solve as there is already a cycle removal step.
Loops, such as in iteration 2 should be removed already. That one is going to take some head work to solve as there is already a cycle removal step.
I think I know what I did here. The step did exist, but then at a later stage of development I removed it as the (mistakenly) thought the improved pruning covered it already so removed what was thought of as a unnecessary repetition of work.
On O4->O11: the direction switch is expected behavior. Without additional parentage or demographic information, a direction is chosen. In the input you use, although you provide parentage there is no way to infer the age of O4 and O11 to direct it the same way on every iteration. However, this relationship dropping out is strange. I will need to find a pattern to when it drops out to determine the solution. My initial suspicion is that either the relationship is not being added when it should, or it is being removed because there exists some other path that is, by chance, weighted higher.
On the M1 and F8 disconnect, I suspect the same programmatic disconnect as M2->F3 or F7->F3 because M1->F8 is a codified entry of half-sibling. Ensuring any codified entries cannot be dropped (as I would expect from the program as a user) will fix this issue.
One the F2 to M2 disconnect. See the above comment on M1 and F8.
Ensuring any codified entries cannot be dropped (as I would expect from the program as a user) will fix this issue
Overall, @rabdill, it seems you IDed some regressions in the program logic which predated effective testing for such regressions.
The two remedies that should fix all the ID problems are:
Thank you for catching these issues!
I feel like I'm playing catch-up - thanks for your detailed analysis @rabdill! A few minor notes:
Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support
I think 2 and 3 are covered, but 1 is not. GitHub has suggestsions for both a contributing guide and a CoC. The CoC probably isn't going to see any use at this stage, but the few times I have had to call on a CoC I've been glad it was already in place, so I'd suggest creating one.
Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
I see that there is example data in the repo, but it's not clear which files belong together. Also, the only example usage is in the multiple-runs example. Is it expected that the software is always used in that mode, or would it make sense to give a single-run example that uses data from the example-data directory?
@djmitche
I can add a CONTRIBUTING doc and agree that a CoC is probably not going to be used at this stage, but could be useful to have in place well before it needs to be used.
The example-data
files were named according to what option in the CLI they correspond to, therefore all the relatedness-*
files go to --relatedness
and same for parentage*
and demographics*
. As for multi versus single run, I would expect multi-run is the manner most users will run it in; I only have some much control over how GraphViz will layout a particular relped
output so multi-run allows for rapidly building what should be the same semantic graph where connections are consistent, but layout changes (Rich pointed out this was not always the case). I will still add a single run example just above the multi run example as it shouldn't distract from usage by any means so is harmless at worst and helpful at best.
If the multi-run option is the normal course, it's fine to list only that example. Perhaps a small rewording of the README would help others not wonder the same thing I did :)
I can work on the wording. Adding a single-run example for initial exploration of the tool for newcomers is probably more beneficial than not so will likely still change that part of the README.
I'm sorry I've been slow to get back to this -- I really have no excuse. At this point I'm just waiting on CONTRIBUTING.
One thing in the paper that leaves me with a lingering uncertainty is "As more information is provided, connections are better resolved." Perhaps this is my ignorance of the subject matter, but given a body of relatedness information that does not uniquely identify a single pedigree, what does it mean to be "poorly resolved"? Or to ask another way, what is the prior state on which "better resolved' is based? I can imagine that a user of the software would want to know concretely, would it ever draw an edge between two individuals that would later be removed given additional (non-contradictory) information?
If that's already known to someone doing this kind of work, then please consider it a question for my own curiosity (and don't feel obliged to answer it!)
@djmitche I am not concerned about any delays right now. Not "normal" times and I would not be overly concerned during normal times anyhow. :smile:
As for the "better resolved" point, that is intentionally open wording that is understood by those doing the work already. In brief, because there is still some expert knowledge and outside information that can factor into building the "true pedigree" as opposed to the "pedigree network" that relped
builds, I do not differentiate between any information known at the start of a project and information discovered during a project. For example, relped
was written when we had to build the pedigree of captive animals, but despite best efforts we do not always "know" the relationship between captive animals (i.e., sometimes mistakes are made) so we started with relatedness scores built from genotypes then, as we confirmed the knowns we encoded those as parent-offspring (PO), full-sibling (FS), and half-sibling (HS). We built many pedigrees during the course of the project to balance those concerns which could not so readily be additional features of relped
.
This progressive pedigree building is also the reason I was so shocked when Rich pointed out that some encoded input entries do not always appear in the final pedigree. This is a breaking bug which I still need to resolve.
Does that answer your inquiry?
I will get the CONTRIBUTING added later. I intend to "steal" the content from another project of mine here.
Thanks for the explanation. The key point for my review is "wording that is understood by those doing the work already" :)
All of the checkboxes for my review are complete, so once @rabdill's concerns are addressed I think this is good to go.
:wave: all, just a friendly check-in to see how things are going with this review?
Hi @arfon. I think my review is technically the missing piece. I'm not sure how this works within the JOSS review process, but I was hoping to wait for the completion of two issues that popped up during review: https://github.com/rhagenson/relped/issues/42 https://github.com/rhagenson/relped/issues/43
In short, one of the stated expectations of the software is that its conclusions are consistent across runs, but there are some cases where that isn't the case, and identical inputs can result in different outputs. I believe @rhagenson is working on it. Based on the functional claims of the software, I'm not comfortable checking the "Functionality" box until those things are fixed. I suppose an alternative would be to change the documentation to reflect a nondeterministic implementation, but it's my impression this would be a significant step backwards for the software itself.
(I've marked my review as complete but I concur with Rich's concerns)
I agree that it should not be moved forward until I address the issues Rich noted. I apologize about any delays; I was correct that I am as available as ever, but that availability has not translated into productivity. My plan is to make significant progress on the aforementioned issues this week. Thank you to each of you for your patience with me.
No problem that some items are being worked on, but I will mark this submission as "paused" in the meantime for our internal bookkeeping.
(feel free to reassign me when this is again in motion -- un-assigning to remove it from my github dashboards)
Well, so much for that idea...
I agree that it should not be moved forward until I address the issues Rich noted. I apologize about any delays; I was correct that I am as available as ever, but that availability has not translated into productivity. My plan is to make significant progress on the aforementioned issues this week. Thank you to each of you for your patience with me.
@rhagenson@ @marcosvital - it looks like this submission is very close to crossing the line here. Is there anything we can do to help you wrap this up :soon: ?
Well, so much for that idea...
@djmitche apologies for this. @whedon 'helpfully' goes around looking for missing assignments on JOSS issues every 24 hours so even if you unassigned yourself, you'll get reassigned by Whedon.
Hi, @rhagenson! Hope everything is fine with you. Did you made any progress on the two issues? Let us know if you have any news.
@marcosvital, I am still alive. I have many reasons for the delay, but the most informative is that in May I rapidly moved from my work in conservation (which this submission was a part of) to being a dedicated COVID-19 data analyst with my local health department. I have spent the last few months consuming COVID-19 literature, responding to outbreaks, and building computational scaffolding to help get us through this pandemic. I have no intention on abandoning this work, but I also cannot say for certain how soon I will have the time and energy to resolve the final two issues.
Context, I am dedicated to the largest city affecting the blue line below (source):
Hi, @rhagenson! Thank you for the update, I'm glad to know that you are fine. Since you are not abandoning the project, we will keep this submission as paused, and wait for you.
@djmitche and @rabdill, thank you so much for all the time you dedicated into reviewing this submission. This is probably going to be the last notification for a while, until @rhagenson get to work on this again. When that happens, I can make the final checks if you happen to not be available anymore.
Hi folks -- should we consider closing this? It's been in my list of assigned issues for a long time....
Hi, @rhagenson! I hope everything is fine with you. Let us know if you have any plans to work again on this submission, ok?
Hi, @djmitche! Thank you for all the effort you put so far on this submission. While it is possible that Ryan is ready to work on it again, it is also possible that we will let this paused for a while longer. If you feel that is better to step out of the process at this point, it will be no problem at all: I can take the role of a reviewer as we move on.
If you choose to do that, my suggestion would be for you to just turn off the notifications from this issue - this way you will stay listed as a reviewer (I think this would be the best, since you already did a lot of work here).
:wave: @rhagenson – just checking in here to see if your intent is still to complete this process?
My task-tracker says it's been 1.7 years here and no response. I think it's time to close this?
You know there is still a pandemic going on, right? And as mentioned above, I now work in epidemiology? Your task-tracker is my lowest priority.
@arfon My intent is still to complete this; that has not changed. However, nothing about my ability to reprioritize this has change either.
Great, thanks for the reply!
@rhagenson – I appreciate that your attention is on other (very important) work, but JOSS does not have an open-ended commitment to you to wait for your updates (this submission is coming up to 2 years old now). As such, I'm going to go ahead and close this review. Should you find time to make these updates in the future we'd welcome a resubmission.
@djmitche @rabdill @marcosvital – many thanks for all of your efforts here.
Submitting author: @RHagenson (Ryan Hagenson) Repository: https://github.com/rhagenson/relped Version: v1.0.0 Editor: @marcosvital Reviewer: @djmitche, @rabdill Archive: Pending
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@djmitche & @rabdill, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @marcosvital know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @djmitche
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @rabdill
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper