tahiri-lab / aPhyloGeo

aPhyloGeo is a multiplatform application for the analysis of genetic and climatic conditions.
https://pypi.org/project/aphylogeo/
GNU General Public License v3.0

Comments on Manuscript #49

Closed mmore500 closed 2 weeks ago

mmore500 commented 3 months ago

Part of JOSS review https://github.com/openjournals/joss-reviews/issues/6579

typo/grammar: "between a genetic of species and its habitat during the reconstruction"

Statement of Need

State of the Field

Figure

Pipeline:

Multiprocessing:

What windows are you referring to?

Dependencies:

Citations to the software described would be appropriate.

Conclusion:

Overall:

As is, the reader would greatly benefit from more concrete and specific descriptions of the library's features and applications. A specific application example or case study would greatly improve the clarity of the manuscript.

hazem-dev commented 2 months ago

Thank you for your insightful review and constructive feedback. Your comments were instrumental in guiding the revisions we made to the manuscript. Here's a summary of how we've addressed your specific concerns:

  1. Introduction and Statement of Need: We've sharpened the focus and articulated the core scientific questions our research aims to answer.
  2. State of the Field: The terminology related to "topological agreement" has been refined for greater clarity and precision.
  3. Figure and Pipeline: We've improved the figure with clearer labels and enhanced readability. Explanations of each pipeline step and the use of the YAML file have been added for a deeper understanding of the process.
  4. Multiprocessing, Dependencies, and Conclusion: We've incorporated your feedback on window types, software citations, and refined the concluding remarks for a stronger finish.

In addition, we emphasize the real-world applications of aPhyloGeo in several key areas:

To further showcase the practical utility of aPhyloGeo, we're actively using it in our ongoing project, iPhyloGeo++, available on GitHub. We sincerely appreciate your time and expertise. Please don't hesitate to share any additional thoughts or suggestions you may have.

mmore500 commented 2 months ago

Some further comments on the manuscript:

Smaller comments:

hazem-dev commented 2 months ago

Thank you, @mmore500, for your detailed and insightful feedback. We appreciate your continued efforts to improve our manuscript. We have addressed your concerns as follows:

  1. Word Count Reduction: We have significantly revised the manuscript to reduce redundancy, tighten language, and remove less crucial details. We are confident that the revised manuscript meets the JOSS word limit requirements.

  2. "State of the Field" Revision: This section has been retitled and references have been updated to include more non-self citations, providing a broader perspective on the field.

  3. Figure Revisions: We have increased the font size on the figure to ensure legibility. The Python and BioPython logos have been removed to avoid any implication of direct affiliation.

  4. Taxonomic Unit Clarification: The term "genetic trees" refers to phylogenetic trees constructed from genetic data, where the taxonomic units are typically populations or species, not individual genes. This has been clarified in the text.

  5. "Climate Trees" Justification: Constructing climate trees and performing tree distance comparisons provides a visual and quantitative way to assess the overall congruence between genetic relationships and climate similarities, complementing direct correlation tests and offering additional insights.

  6. Confounding Spatial Effects: We acknowledge the potential confounding effects of spatial proximity on both genetic relatedness and climate similarity. Currently, our analyses do not explicitly control for spatial effects. We have added a discussion of this limitation and potential future directions for addressing it.

  7. Dependency Clarification: The dependencies listed in the manuscript but not in setup.py are optional and not strictly required for end-users. We have clarified this in the text and removed specific version numbers.

  8. Minor Comments: We have addressed all the minor comments, clarifying vague language, revising verb tense, and adding links to relevant GitHub content at specific commits.
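The tree distance comparison mentioned in item 5 can be illustrated with a minimal sketch. It is not aPhyloGeo's implementation: the nested-tuple tree encoding, the function names, and the two toy trees are assumptions made for illustration, and the distance is the rooted (clade-based) form of the Robinson-Foulds metric.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple of leaf names, e.g. (("A", "B"), ("C", "D")).
    This encoding is a hypothetical choice for the sketch.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def robinson_foulds(tree_a, tree_b):
    """Count the clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Toy trees over the same five taxa (illustrative data only).
genetic_tree = ((("A", "B"), "C"), ("D", "E"))
climate_tree = ((("A", "C"), "B"), ("D", "E"))

rf = robinson_foulds(genetic_tree, climate_tree)
```

A distance of zero means the two trees group the taxa identically; larger values indicate less agreement between the genetic and climatic groupings.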

We have added a reference to iPhyloGeo++ to our manuscript, as suggested.

We believe that these revisions have significantly strengthened the manuscript and addressed all of your concerns. We are grateful for your comments and look forward to your further feedback.

mmore500 commented 1 month ago

Thank you for your comments.

I had the opportunity to look through the revised manuscript, and some significant improvements have been made. I opened #52 with some minor grammar and vocabulary suggestions.

To continue with some specific points,

> Figure Revisions: We have increased the font size on the figure to ensure legibility. The Python and BioPython logos have been removed to avoid any implication of direct affiliation.

I can see that the external logos have been removed. However, I remain concerned with the font sizes, which do not seem to have significantly changed. Here's a screen grab of a previous draft beside the current draft.

Screenshot from 2024-07-05 21-33-34

Ideally, the text in the figure should (at smallest) not be appreciably smaller than the figure caption font size. The fact that all font sizes (e.g., including major titles) are much smaller than the caption font significantly contributes to the legibility issue here.

> Confounding Spatial Effects: We acknowledge the potential confounding effects of spatial proximity on both genetic relatedness and climate similarity. Currently, our analyses do not explicitly control for spatial effects. We have added a discussion of this limitation and potential future directions for addressing it.

I did not find this in the manuscript.


Some additional comments. Quoting now from the manuscript itself,

> The methods validate input parameters to ensure data accuracy and prevent errors (refer to the YAML file).

It is unclear what would be found in "the YAML file" without clicking through to see that it is a listing of parameters.

> Emphasizing software development best practices and open-source principles (e.g., iPhyloGeo++), aPhyloGeo ensures reliability and sets the stage for ongoing innovation.

Based on our discussion, I understand that the reference to iPhyloGeo++ is meant to demonstrate a "Project Using the Software" to evidence impact/adoption. In the actual manuscript, it is unclear why iPhyloGeo++ is mentioned.

As an additional comment, this sentence has a lot of buzzwords but does not communicate a point that is specific and tangible. This pattern arises (to varying degrees) throughout the manuscript. This has improved somewhat in recent drafts, but I would encourage a careful pass through the text to consider this issue.

> Sequences with notable variability were specifically retained for analysis.

The tense of this sentence makes it sound like you are describing a specific project where an analysis was applied, instead of a general tool for analysis.

> aPhyloGeo employs algorithms using metrics like least squares (Felsenstein, 1997), Euclidean, and Robinson-Foulds (Robinson & Foulds, 1981) distances to ensure statistically sound correlations.

Why are these details relevant to mention in the abstract? Consider clarifying or deleting.

> In the comparison of phylogenetic trees, which are constructed based on genetic data, with climatic trees, a crucial step involves applying a phylogeography approach.

This is the only place that climatic trees are mentioned in the manuscript. Based on previous drafts of the manuscript and our discussion, I understand this concept to be central to the aPhyloGeo workflow. I am under the impression that the "climate tree" is novel methodology that your group has been involved in developing, and so is likely unfamiliar to much of the potential readership. This method should be briefly defined, explained, and justified. A reference to a publication introducing or applying the method would be highly beneficial.
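For readers unfamiliar with the idea, one plausible reading of a "climate tree" is a tree that groups sampling sites by the similarity of their climate measurements. The sketch below is only an assumption about what such a construction could look like, not the method used by aPhyloGeo or described in the manuscript: it uses Euclidean distance between measurement vectors and average-linkage (UPGMA-style) clustering, and the site names and values are invented.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length measurement vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def climate_tree(records):
    """Cluster sites by climate similarity using average linkage.

    `records` maps a site name to a tuple of climate measurements.
    Returns the tree as a nested tuple of site names.
    """
    # Each cluster is (subtree, list of member site names).
    clusters = [(name, [name]) for name in sorted(records)]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(euclidean(records[a], records[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


# Hypothetical sites with (temperature, relative humidity) values.
sites = {
    "A": (10.0, 80.0),
    "B": (11.0, 82.0),
    "C": (25.0, 20.0),
    "D": (27.0, 23.0),
}
tree = climate_tree(sites)
```

In practice the variables would likely need rescaling before computing distances, since temperature and humidity are measured in different units.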

Additional comments

Please double-check your links by clicking through all of them in the PDF. Some appear to be dead in the current draft.

There appear to be distinct comparison methods for trees and for sequences. This could be clarified by specifying more explicitly what is being compared in the "Similarity Methods" section.

TahiriNadia commented 1 month ago

Thanks, @mmore500, for your reviews. We will address them point by point.

TahiriNadia commented 1 month ago

I had the opportunity to look through the revised manuscript, and some significant improvements have been made. I opened https://github.com/tahiri-lab/aPhyloGeo/pull/52 with some minor grammar and vocabulary suggestions.

Done ✅. Thanks.

TahiriNadia commented 1 month ago

> To continue with some specific points,
>
> Figure Revisions: We have increased the font size on the figure to ensure legibility. The Python and BioPython logos have been removed to avoid any implication of direct affiliation.
>
> I can see that the external logos have been removed.
> However, I remain concerned with the font sizes, which do not seem to have significantly changed.
> Here's a screen grab of a previous draft beside the current draft.
>
> Screenshot from 2024-07-05 21-33-34
> ![image](https://github.com/tahiri-lab/aPhyloGeo/assets/19578926/6869a106-d975-4ecf-b1eb-607fc5b0b023)
>
> Ideally, the text in the figure should (at smallest) not be appreciably smaller than the figure caption font size.
> The fact that all font sizes (e.g., including major titles) are much smaller than the caption font significantly contributes to the legibility issue here.

Done ✅. We completely updated the new figure (see below) with particular attention to the text size. It is now more accessible. Thanks.

[Image: updated figure]

hazem-dev commented 1 month ago

@mmore500,

Thank you for your thorough and insightful review of our manuscript. We greatly appreciate the time and effort you have invested in providing valuable feedback.

We have carefully addressed each of your comments in our revised manuscript. Below, we provide a detailed response to each of your points and describe the corresponding changes made:

  1. Confounding Spatial Effects: We have now included a section discussing the potential confounding effects of spatial proximity on both genetic relatedness and climate similarity. This limitation and possible future directions for addressing it, such as incorporating spatial autocorrelation analysis and spatial regression models, are now explicitly mentioned under the "Spatial Proximity Confounding Effects" section.

  2. YAML File Reference: We have clarified the contents and purpose of the YAML configuration file. The methods section now details the types of parameters listed in the YAML file, providing a clearer understanding without the need to click through the link.

  3. Reference to iPhyloGeo++: The reference to iPhyloGeo++ has been rephrased to clearly illustrate its significance as a "Project Using the Software" to evidence impact and adoption. This section now underscores the practical application and impact of aPhyloGeo in phylogeographic research, enhancing its context and relevance.

  4. Buzzwords and Specificity: We have conducted a careful review of the manuscript to eliminate overly broad statements and buzzwords.

  5. Sentence Tense: The sentence "Sequences with notable variability were specifically retained for analysis" has been revised to "Sequences with notable variability are specifically retained for analysis" to reflect the general applicability of the tool rather than a specific project.

  6. Abstract Details: The details regarding the algorithms employed by aPhyloGeo have been streamlined in the abstract to focus on the core functionality and benefits of the software.

  7. Climatic Trees Explanation: We have introduced a brief definition and explanation of climatic trees, a novel methodology central to the aPhyloGeo workflow. This includes a description of how climatic trees are constructed and their significance in phylogeographic analysis.

  8. Link Verification: We have thoroughly checked all links in the manuscript to ensure they are functional and lead to the correct resources.

  9. Clarification of Comparison Methods: The distinction between tree and sequence comparison methods has been clarified in the "Similarity Methods" section.
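As a rough illustration of the spatial autocorrelation analysis mentioned in item 1, the sketch below computes Moran's I for a variable measured at several sites. It is a generic textbook formulation and not part of aPhyloGeo: the function name, the neighbour weight matrix, and the toy values are assumptions chosen for the example.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weight matrix `weights`.

    weights[i][j] > 0 marks sites i and j as spatial neighbours;
    diagonal entries are ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    off_diagonal = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_total = sum(weights[i][j] for i, j in off_diagonal)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in off_diagonal)
    variance_sum = sum(d * d for d in dev)
    return (n / w_total) * (cross / variance_sum)


# Four hypothetical sites along a transect; each site neighbours the next one.
neighbours = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

gradient = morans_i([1.0, 2.0, 3.0, 4.0], neighbours)     # smooth spatial gradient
alternating = morans_i([1.0, 4.0, 1.0, 4.0], neighbours)  # neighbours always differ
```

Positive values indicate that neighbouring sites tend to have similar values, and negative values indicate that they tend to differ. Strong positive autocorrelation in both the climatic and the genetic data would suggest that spatial proximity may be driving part of any observed association.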

Thank you once again for your valuable comments and for considering our revised manuscript. We look forward to your favorable response.