marccanby / linguiphyr

Linguistic Phylogenetic Analysis in R
Apache License 2.0

Issues with paper #1

Closed SimonGreenhill closed 7 months ago

SimonGreenhill commented 7 months ago

Some comments on the manuscript in https://github.com/marccanby/linguiphyr/blob/main/paper/paper.md?plain=1

As a proviso, I'm a proponent of a very different framework (Bayesian approaches) and my comments should be read through this lens. I have, however, tried not to be too partisan in my suggestions below.

Bigger questions and remarks:

marccanby commented 7 months ago

I thank the reviewer for his helpful feedback. I have updated the paper in response on branch "1-issues-with-paper". Below are my responses to each comment:

L30: "undertaken by statisticians" - Please reword, this is incorrect. I only know of one statistician publishing language phylogenies. If you look here most authors are linguists, followed by biologists or computer scientists. Response: This point is well-taken, and I shall revise the sentence as follows:

However, much of the work is highly technical and not easily accessible to the typical classically-trained historical linguist; this is largely (and understandably) due to the highly mathematical and computational nature of the work.

L32-33: "graphical user interface (GUI)" - should be capitalised if you're using it as an acronym. Response: The acronym, GUI, is capitalized, but I have followed the Oxford English Dictionary (OED) in not capitalizing the term “graphical user interface” itself: https://www.oed.com/search/dictionary/?scope=Entries&q=gui.

L42: "easily interpretable by linguists" - because linguists can't understand likelihood or Bayesian approaches? please reframe. Perhaps just "relatively simple interpretations" or "conceptually simple" (although I note that maximum parsimony is not as simple as it appears on the surface for many reasons). Response: I do not mean to imply that linguists are unable to understand likelihood or Bayesian approaches. I shall revise the sentence to remove "by linguists".

L46: "tree search. Other concerns about fully parametric approaches have been raised as well, such as the suggestion that non-parametric methods like parsimony are more accurate [@barbancondiachronica2013; @tutorialNicholsWarnow; @holmes2003statistics]. " -- this is an unfair reading of the literature. I've argued in a number of papers that Bayesian methods are far superior to MP for language trees, but there are (more importantly) long-standing issues with MP in terms of it being statistically problematic (e.g. Felsenstein 1978 Syst. Zool, Steel & Penny 2000, Mol. Biol. Evol.). Nor is it correct to say that MP is non-parametric, it has an implicit model behind it (Steel and Penny again). Response: I agree with the reviewer that more could be said in favor of Bayesian methods; he is right that they have many benefits not shared by parsimony, and I do not mean to disparage them – indeed, I would like to incorporate them in the future. Because the MP vs. Bayesian debate is not the point of this paper, I propose to rewrite the entire paragraph (ll. 41-47) as follows:

We note that, at present, our software focuses on parsimony-based tree estimation and analyses. We make this choice because such an approach is easily interpretable: the best tree is simply the tree that minimizes the number of state changes. This makes it easy for linguists to see the effect of each character in the dataset on tree search. However, parsimony-based methods are limited to searching for and analyzing tree topology: studies seeking to explore ancestral node dating (glottochronology) or branch lengths are better suited to likelihood or Bayesian approaches. Future work will include the incorporation of other search algorithms and analytical methods into LinguiPhyR.
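To make "the tree that minimizes the number of state changes" concrete, here is a minimal sketch of Fitch's small-parsimony algorithm for one character on a fixed rooted binary tree. The tree, tip names, and states are invented for illustration; LinguiPhyR itself delegates tree search to PAUP*.

```python
def fitch_score(tree, tip_states):
    """Return the minimum number of state changes for one character.

    tree: nested tuples of tip names, e.g. (("A", "B"), ("C", "D"))
    tip_states: dict mapping tip name -> character state
    """
    changes = 0

    def walk(node):
        nonlocal changes
        if isinstance(node, str):        # leaf: singleton state set
            return {tip_states[node]}
        left, right = (walk(child) for child in node)
        if left & right:                 # states agree: no change needed here
            return left & right
        changes += 1                     # disjoint sets: one state change
        return left | right

    walk(tree)
    return changes

# Toy cognate-class character: A and B share class 0, C and D share class 1.
states = {"A": 0, "B": 0, "C": 1, "D": 1}
print(fitch_score((("A", "B"), ("C", "D")), states))  # -> 1
# A grouping that splits the cognate classes is less parsimonious:
print(fitch_score((("A", "C"), ("B", "D")), states))  # -> 2
```

A character like this one "prefers" the first tree: it needs only a single innovation there, versus two independent changes on the second.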

L56: "<!---Currently, the de-facto"... it's a shame this is commented out as it makes a good point. I wouldn't, however, frame this as a 'standard', just that there's a set of tools in use. Some are parsimony based, some are likelihood, or Bayesian. There are dedicated packages for Bayesian methods out there and tutorials for them (e.g. https://doi.org/10.1093/jole/lzab005 or your citation to IndoEuropeanphylogeneticswithR), but MP is harder to use. Response: Having done extensive studies comparing parsimony and Bayesian inference methods on simulated data (results in a separate forthcoming publication), I respectfully disagree that MP is harder to use: there are no hyperparameters to set, it is much faster to run, and we don’t have to deal with convergence. However, I shall include a revised version of this sentence taking the reviewer’s feedback into account:

Currently, the go-to method for phylogenetic analysis is Bayesian inference, which, despite efforts to reduce the barrier to entry, requires reasonable mathematical maturity to understand and operates largely as a black box.

L64: "Over-emphasis on technical ability often hinders this work." -- is "over-emphasis" the right term? Who is emphasising this? Response: I shall revise this sentence to the following:

Giving linguists the option to spend their time analyzing trees in a GUI rather than writing complicated code will facilitate this work.

L112: "which can be standard, irreversible, or custom" -- should explain these. Response: I shall add a paragraph explaining these:

Each character may also be declared "standard", "irreversible", or "custom". Standard characters permit any change of state (e.g. from 0 to 1 or from 1 to 2) with uniform cost. This is generally appropriate for lexical characters where the states represent cognate classes. Irreversible characters are binary characters that may transition from 0 to 1 but not from 1 to 0. This is appropriate in the case of phonological mergers, which are generally considered irreversible. Finally, custom characters allow the user to declare which state transitions are allowed, and what the cost should be for each permitted transition. The exact way to specify this is described in the "Data Upload" page of the app.
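The three character types can be thought of as transition-cost matrices in Sankoff-style weighted parsimony. The sketch below is illustrative only; the names and values are assumptions, not LinguiPhyR's internal representation.

```python
INF = float("inf")  # a forbidden transition has infinite cost

def standard(states):
    """Standard character: any change between distinct states costs 1."""
    return {(a, b): (0 if a == b else 1) for a in states for b in states}

# Irreversible binary character: 0 -> 1 allowed, 1 -> 0 forbidden
# (e.g. a phonological merger that cannot be undone).
irreversible = {(0, 0): 0, (0, 1): 1, (1, 0): INF, (1, 1): 0}

# Custom character: the user picks which transitions are allowed and at
# what cost (here, going from 0 to 2 is twice as costly as 0 to 1, and
# no state may revert to an earlier one).
custom = {(0, 0): 0, (0, 1): 1, (0, 2): 2, (1, 1): 0, (2, 2): 0,
          (1, 0): INF, (2, 0): INF, (1, 2): INF, (2, 1): INF}

print(standard([0, 1, 2])[(1, 2)])   # -> 1
print(irreversible[(1, 0)])          # -> inf
```

Under this view, a standard character's matrix is symmetric with uniform off-diagonal costs, while irreversible and custom characters break that symmetry.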

L118: this coding scheme would also work for structural/typological data which is often used. It would be good to mention this. Response: I shall add a sentence at the end of l. 124 as follows:

Our coding scheme is also applicable to phonological, morphological, and structural/typological characters, which are abundant in phylogenetic datasets.

L124: "An abundance of literature discusses good methodology for doing this [@ringe2002indo; @tutorialNicholsWarnow]." -- would it be better to cite some standard historical linguistics references here? Response: I will update this statement as follows:

An abundance of phylogenetics literature discusses good methodology for doing this [@ringe2002indo; @tutorialNicholsWarnow; @heggarty2021cognacy]; classical historical linguistics references are also helpful [@ringeska; @campbell2013historical].

L152: probably worth noting why this might not hold (e.g. if any one language in the clade has lost this cognate or evolved a new cognate then this will not hold). Response: Great suggestion! Here is some added text:

However, it is important to note that a clade on a particular tree may be supported by more characters than just those that meet this condition. For example, if the dominant cognate class in a clade is lost by just one language in the clade, the character will still support the grouping if removing the edge that separates the clade from all other languages would produce a less parsimonious tree. This can be examined in the "Analysis" page of the application.

In responding to this, I also realize that l. 189 should have a clarifying sentence:

A character is deemed to support an edge if and only if the edge’s collapse increases the parsimony score for that character.

L154: the commented out information here looks useful. Can it be added back? Response: This material is described earlier in the text (ll. 110-113), and even further by the paragraph about character types I have written in response to another comment by the reviewer. I think that with the inclusion of that text, it is unnecessary to include these sentences.

L158: "integer" - given the emphasis of this package for non-computational uses, "integer" is probably better as "numeric" Response: Agreed, but (see previous comment) I favor leaving these few sentences out because they are described elsewhere in the paper, and the information about data types is not necessary for this writeup, which mostly addresses higher level issues (the data types are described on the “Data Upload” page of the app).

L178 etc: it would be helpful to have citations for e.g. "compatibility" so users can track down what these mean if they want. Response: I shall clarify these terms as follows (with references where relevant):

The compatibility score is the total number of characters that evolve on the tree without homoplasy (see @warnow2017computational for further detail). To calculate total edge support and minimum edge support, we first calculate the number of characters that enforce, or support, each edge, based on whether or not the collapse of that edge would increase the parsimony score. Total edge support is the sum of these support values across all edges, and minimum edge support is the minimum of these values.
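The edge-support statistics reduce to simple bookkeeping once per-character parsimony scores are in hand. A minimal sketch, assuming scores have already been computed for the full tree and for each tree with one internal edge collapsed (e.g. via PAUP* or a parsimony routine); the edge names and numbers below are invented for illustration.

```python
def edge_supports(full_scores, collapsed_scores):
    """Count supporting characters per edge.

    full_scores: per-character parsimony scores on the full tree.
    collapsed_scores: dict mapping edge -> per-character scores with
      that edge collapsed. A character supports an edge iff its score
      strictly increases when the edge is collapsed.
    """
    return {edge: sum(c > f for f, c in zip(full_scores, scores))
            for edge, scores in collapsed_scores.items()}

full = [1, 2, 1]                      # per-character scores, full tree
collapsed = {"e1": [2, 2, 1],         # only character 1 supports e1
             "e2": [1, 3, 2]}         # characters 2 and 3 support e2
support = edge_supports(full, collapsed)
print(support)                        # -> {'e1': 1, 'e2': 2}
print(sum(support.values()))          # total edge support -> 3
print(min(support.values()))          # minimum edge support -> 1
```

Minimum edge support is useful as a worst-case measure: a tree with one weakly supported edge scores low even if its other edges are strongly supported.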

Bigger questions and remarks: is there any reason you don't allow likelihood approaches? paup* handles these quite happily set criterion=likelihood; Response: This is a fantastic feature suggestion for this app. I decided not to be too ambitious for the first iteration of this app, because I want to make sure that it is well-tested and reliable for the conditions it claims to cover. I would very much like to add distance-based methods, likelihood methods, and perhaps even simulation utilities. Some of this is addressed in the last sentence of the paper about future work, but I don’t want to promise too much in the writeup.

I would like to see some more justification of PAUP. Many parsimony and distance algorithms etc are implemented directly in R packages like phangorn or ape -- is there a reason to shell out to PAUP? I can see efficiency being a criterion (PAUP will blow R out of the water) but the type of user using this package is unlikely to be using data where computational efficiency will become critical. Response: This is a great point that I myself have wrestled with in the development of this software, because it certainly would be convenient not to require the user to download PAUP. In fact, I use castor’s asr_max_parsimony function in the analysis code of the app, so I am very aware of the usefulness of these functions. However, for tree search, a reliable and well-optimized package is necessary due to the huge search space, and the best such package is PAUP*. I disagree with the claim that “the user using this package is unlikely to be using data where computational efficiency will become critical” – I am currently working with a linguistics graduate student who has this exact issue. Furthermore, many new datasets (such as IECor) have many dozens of languages, which makes optimized software even more necessary for tree search.

SimonGreenhill commented 7 months ago

ok, I'm happy with all of these responses. Thanks!