Open LinguList opened 1 year ago
I can confirm the not-so-user-friendly part of the missing URLs to the Zenodo'd files, but the script runs without issue in RStudio for me once the libraries are installed. Here's my session info:
sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin21.6.0 (64-bit)
Running under: macOS Ventura 13.0.1
Matrix products: default
LAPACK: /opt/homebrew/Cellar/r/4.2.2/lib/R/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] colorspace_2.0-3 rworldmap_1.3-6 sp_1.5-0 lme4_1.1-30 Matrix_1.5-1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 compiler_4.2.2 pillar_1.8.1 nloptr_2.0.3 viridis_0.6.2 tools_4.2.2 dotCall64_1.0-2
[8] boot_1.3-28 viridisLite_0.4.1 lifecycle_1.0.3 tibble_3.1.8 gtable_0.3.1 nlme_3.1-160 lattice_0.20-45
[15] pkgconfig_2.0.3 rlang_1.0.6 DBI_1.1.3 cli_3.4.1 rstudioapi_0.14 spam_2.9-1 gridExtra_2.3
[22] dplyr_1.0.10 generics_0.1.3 vctrs_0.4.2 fields_14.1 maps_3.4.0 grid_4.2.2 tidyselect_1.1.2
[29] glue_1.6.2 R6_2.5.1 fansi_1.0.3 foreign_0.8-83 minqa_1.2.4 ggplot2_3.3.6 purrr_0.3.5
[36] magrittr_2.0.3 maptools_1.1-4 scales_1.2.1 MASS_7.3-58.1 splines_4.2.2 assertthat_0.2.1 utf8_1.2.2
[43] munsell_0.5.0
Then maybe it is because the packages are not installed. In that case, I suggest using something more consistent, like groundhog in R, to load packages as of a particular date, and providing a list of the packages I need to install (I was doing the dummy-installation). Note also that this probably cannot work in RStudio if you follow the file-name swaps between language_phoible.csv and phoible_language.csv.
One can install packages by version:
https://search.r-project.org/CRAN/refmans/remotes/html/install_version.html
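To make that concrete, here is a sketch of what version pinning could look like. The version numbers are simply the ones from my sessionInfo above, not necessarily the ones the script was written against, and the groundhog date is an arbitrary example:

```r
# Sketch: pin package versions with remotes (versions taken from my sessionInfo,
# not necessarily the ones the script requires).
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_version("lme4", version = "1.1-30")
remotes::install_version("rworldmap", version = "1.3-6")

# Alternatively, groundhog pins everything to a single date instead of
# per-package versions:
# install.packages("groundhog")
# groundhog::groundhog.library(c("lme4", "rworldmap", "colorspace"),
#                              date = "2022-11-01")  # example date
```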
Personally, I've never had a problem with R package versions; that is more of a common problem in Python.
And yes, the file renaming is a bit annoying and unnecessary.
I did not know that, but in any case one needs to provide (1) the list of packages that need to be installed, and (2) their versions.
In the R world it's implicit that the user installs the packages that get loaded by the library() function. But yes, I agree it would be better to write code that explicitly checks for and installs them for the user.
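Something along these lines would make that check explicit (a minimal sketch; the package list is taken from the sessionInfo earlier in this thread and may not be complete for the script):

```r
# Install any missing packages before loading them, so the script is
# self-contained for a fresh user.
pkgs <- c("lme4", "rworldmap", "colorspace")  # example list, possibly incomplete
missing <- pkgs[!sapply(pkgs, requireNamespace, quietly = TRUE)]
if (length(missing) > 0) install.packages(missing)
invisible(lapply(pkgs, library, character.only = TRUE))
```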
Or just add a little readme telling me, as a new user, to install these packages, download those datasets, and then run the code. Following the Zen of Python here is crucial for replicable science: explicit is better than implicit.
Hi Mattis,
now I've attended to all the things we talked about. I made the changes directly to https://github.com/Sokiwi/Tone-WordLength/blob/main/tones.R. You should clear away everything you have and start afresh by downloading the revised script. When you go through it you will see changes such that
the needed packages are installed automatically if you don't have them
information on versions is added (while also mentioning that it is not so important)
download instructions are more precise and consistent (URLs apparently work better than DOIs, since my ASJP word-length data couldn't be found through its DOI, perhaps because it is a new repo; so now I give URLs first, but also DOIs and version numbers)
file renaming is now done by the script
the figures are identical to what was submitted (the script was producing color figures, but I submitted b/w figures and had somehow forgotten to update the script to do b/w)
I can't (or rather don't want to) control where you download and place files, but I have given instructions about that. The downloaded files should be put in your working directory. This is really the only place where the user can screw up. But it is also the kind of action that a user wants to have control over. I gave some tips about how to manipulate the working directory in the comments.
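For what it's worth, the script could also fail early with a clear message when the downloads are not in the working directory, which would catch that one remaining way to go wrong (a sketch; the file names are the ones mentioned in this thread):

```r
# Fail early if the downloaded files are not in the working directory.
needed <- c("wals-v2020.3.zip", "Data-01 ASJP data raw.txt",
            "phoible-v2.0.1.zip")
found <- file.exists(needed)
if (!all(found)) {
  stop("Missing in ", getwd(), ": ",
       paste(needed[!found], collapse = ", "),
       "\nDownload the files and place them here, or setwd() to their location.")
}
```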
I hope everything works smoothly now.
Soeren.
On Wed, Jan 11, 2023 at 9:52 AM Johann-Mattis List wrote:
Or just add a little readme, telling me, as a new user, to install these packages, download those datasets, and then run the code. Following the zen of Python here is crucial for replicable science: explicit is better than implicit.
Okay, I'd consider this solved then. @bambooforest, I do not know how to react to the review in Frontiers, but I guess you can do that. I'd probably have to step back from reviewing, now that this script is provided in a somewhat more friendly way.
One last thing: why not place a Makefile in your repository that does the download for you? Just create a file called Makefile and insert the following:
download:
	curl --location --output wals-v2020.3.zip "https://zenodo.org/record/7385533/files/cldf-datasets/wals-v2020.3.zip?download=1"
	curl --location --output "Data-01 ASJP data raw.txt" "https://zenodo.org/record/6344024/files/Data-01%20ASJP%20data%20raw.txt?download=1"
	curl --location --output phoible-v2.0.1.zip "https://zenodo.org/record/2677911/files/cldf-datasets/phoible-v2.0.1.zip?download=1"
@Sokiwi -- everything works fine for me now. I think you can close this issue.
@LinguList -- as far as I understand, I need to wait until the other reviewer submits their report and then I can open the interactive review forum. Then @Sokiwi can respond to your review. R2 has until the end of the month to submit.
Fine with me. After this, I am out of the reviewing, though, as I cannot judge whether the methods applied here are useful. Given that I could at least help show that more thorough checking of the code would be useful, I think I have done all I can.
But you could add one thing for the download: a Makefile:
download:
	curl --location --output wals-v2020.3.zip "https://zenodo.org/record/7385533/files/cldf-datasets/wals-v2020.3.zip?download=1"
	curl --location --output "Data-01 ASJP data raw.txt" "https://zenodo.org/record/6344024/files/Data-01%20ASJP%20data%20raw.txt?download=1"
	curl --location --output phoible-v2.0.1.zip "https://zenodo.org/record/2677911/files/cldf-datasets/phoible-v2.0.1.zip?download=1"
This gives you all data with one command:
make
@LinguList -- I don't understand your comment above. Are you removing yourself as a reviewer of the paper? The code was always well documented regarding what analyses are being done:
https://github.com/Sokiwi/Tone-WordLength/blob/507594c0a654c3f92babf02481a17cd8f0df6a48/tones.R#L234
I wrote my review, the author answered it, and I myself cannot provide any more input, as I am not an expert in mixed models and the like. So yes, my job is done. I made the world of documented code that uses CLDF data a bit more consistent, albeit not as consistent as I would have hoped, but I cannot do more at this stage.
OK then, thanks for your review of the CLDF use.
I cannot replicate the R script.
I use the Rscript command on a Linux machine. First, your script has errors in the names of the files you describe; you should correct them as follows (in the first lines):
Then, I run into an error here:
The error says:
I do not understand the error, but I figure there is something wrong in the way you suggest downloading and renaming the files. Why not just use git to download the repositories and then use relative paths to the respective data files? Renaming is not best practice here.
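For instance, something like the following (a sketch: the repository names under the cldf-datasets GitHub organization and the cldf/ subdirectory layout are my assumptions about where the data lives, so the paths may need adjusting):

```shell
# Clone the data repositories next to the script instead of
# downloading and renaming individual files.
git clone https://github.com/cldf-datasets/wals.git
git clone https://github.com/cldf-datasets/phoible.git
# The script can then read them via relative paths, e.g. in R:
#   wals_languages <- read.csv("wals/cldf/languages.csv")
```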
Then, you do not provide the Zenodo link for WALS, and you give a DOI for PHOIBLE but a resource link for ASJP; use DOIs in both cases.
All in all, you could even write a Makefile that uses git to download the repositories. That would prevent these errors.
Can you please tell me what to do about the error and correct the script accordingly? I am reviewing this study for Frontiers and would then proceed from there.