Dear @marccanby,

I have now reviewed the manuscript accompanying linguiphyr. I am not a linguist, and the topic is somewhat outside my direct expertise. Nonetheless, I offer some suggestions in the hope that they will help improve the manuscript and the package.
Installation and functionality
[ ] I had some difficulty installing the package by following the instructions in the README. The error was caused by two dependencies: ‘httpuv’ and ‘shiny’. I got the message ‘Do you want to install from sources the package which needs compilation?’. When I chose ‘yes’ the installation failed; when I chose ‘no’ it succeeded. You may want to add a note about this to the README.
[ ] The README mentions that the example dataset is stored in the data folder. I downloaded this file manually from the GitHub page. Would it be an idea to ship this dataset with the package? You could, for example, show in the README how to export the built-in dataset to a local folder, or add an option in the Shiny app that lets the user select ‘example dataset’.
The interface itself runs smoothly and is intuitive.
Manuscript
[ ] The opening sentences of the introduction make several references to ‘mathematical’, ‘computational’ and ‘statistical’. Perhaps double-check whether all of these terms are needed.
[ ] L60. The authors write that no coding is needed. While linguiphyr provides a GUI, one of the benefits of running it from within R is that users can use the outputs for further analysis, or prepare their data within R. Or do you mean something else by ‘coding’?
[ ] L61: ‘we believe phylogenetics can only be useful to historical linguistics if considerable analysis is given to the results of phylogenetic algorithms by linguists’
[ ] L63. You write ‘complicated code’. Perhaps be more specific about what you mean by this. Starting linguiphyr from within R also requires some coding, or at least some basic knowledge of programming.
[ ] L66: “Trees are then displayed in the app and can be downloaded for inclusion in other work.” Here it might be relevant to mention interoperability with other packages. For example, how data can be prepared in other packages, or how they could be used for visualizing trees.
[ ] L68. The author writes ‘These focus on the following questions’. The first two of these questions are also mentioned on lines 53-54, where I think they are explained more clearly. I doubt whether a GUI approach is the most suitable for ‘making linguistic phylogenetic analysis reproducible’. If that is the aim, it might be an idea to also facilitate exporting the settings that were used to generate a particular tree (e.g. as a .CSV).
[ ] L69. The sentence ‘What is the effect of particular coding’. I find this sentence hard to understand. By ‘coding’, do you mean how particular features are coded? Perhaps it would be good to clarify this earlier in the manuscript (and give an example).
[ ] L91. ‘discussion on language common in’: perhaps rephrase to ‘terms common in’?
Further suggestions
[ ] I assume the columns in the example dataset represent languages. Would it be an idea to use glottocodes or isocodes here? This would facilitate interoperability with other datasets and packages.
[ ] References do not show up on the GitHub page (neither in-text citations nor the reference list). But perhaps these are generated automatically when the paper is compiled?
[ ] Currently, PAUP is installed via the browser. Would it be an idea to do that from a prompt within R, i.e. a prompt shown when starting linguiphyr for the first time? It would perhaps also be helpful to add a built-in check for the latest version of PAUP and whether it matches the locally installed version.
Good luck with the revisions!
Kind regards,
Sietze
Thank you so much for all of the great advice and suggestions! I have now addressed all of these as requested. Some notes below.
Installation and functionality.
You are the first with this issue - I have now added a note to the README!
Excellent suggestion! I have now added a button in the app to use the example dataset, as well as a button to download it as a csv. So it is now all streamlined in the app and one doesn’t have to look in a folder or in the git repo.
Manuscript
I have reduced the usage of these terms.
L60: No, you’re right, that is what I mean by coding. The point, though, is that with this app one doesn’t need to code to do these analyses - previously, one would have to either write code to use parsimony packages in R (or a similar language) or write Nexus configuration files for something like PAUP*. If one does want to perform more advanced analyses (as some users have asked me), we provide the option to download the Nexus configuration file, which can then be edited by hand.
L61: I have rewritten this sentence to read: “While these are useful skills, giving linguists the option to spend their time analyzing trees in a GUI rather than writing code will facilitate analyses of phylogenetic inferences”
L63: I have removed this sentence altogether (see (3) above).
L66: I added “(either as images or as Nexus files)” to clarify the formats in which the results can be downloaded.
L68: The line about reproducibility in my draft is actually commented out, a decision I made for the exact reason you listed. So, I totally agree with your suggestion of being able to download configurations used to generate the trees so others can do the same. This is already available to some extent in the ability to download the Nexus files; however, it would be nice to do more in the future.
L69: I have now provided an example for “the effect of a particular coding” on l. 209, where this particular feature is explained in more detail. Thanks for the suggestion!
L91: Done!
Further suggestions
One certainly could use glottocodes or isocodes here! It really doesn’t matter. In my work with linguists, they have either used IE clade names (e.g. “Baltic”, “Slavic” - so not actual languages per se) or often highly specific dialects of a certain language family. But if one wanted to use glottocodes or isocodes that would not be an issue.
Right, they show up automatically when compiling the paper, which you can see in “Actions” on the github page.
I have added a modal error message with the PAUP installation instructions (more or less duplicated from the README) when PAUP is not found - this seems to make diagnosing problems much easier. In the long run, I would definitely like to make PAUP more streamlined within the app as you say - the PAUP website says “Official open-source release of the program is getting very close.” - so once that’s done I hope to just tie it into the installation without requiring the extra step.