Closed nldoc closed 5 years ago
Dear @marinapapa and @thomasp85, thank you very much for your helpful and constructive feedback on our R package. We agree with all the issues you mentioned and we revised our code and package documentation accordingly. The following sections will give you a detailed overview of our revisions, pointing to specific commits, where possible and links to the corresponding vignette/article sections of the corresponding package homepage.
I feel the README could be a bit more detailed when it comes to describing what the package does. As it stands, you need to already be familiar with NetLogo to understand the need/use for the package. While a detailed description of NetLogo is not suited for a README, a small section before Installation would make the README more welcome to new users of actor based simulations in general. Further it would be good to link to the NetLogo online ressources to help users with researching further information. The README should also clearly state that you need to install NetLogo manually and how this is done. Omission of this indicates that NetLogo is installed as part of the package installation.
The vignettes assumes the user is running on windows. While it may be impossible to create examples that run on all systems, as you have to point to the NetLogo installation it would be nice if the vignettes were structured in a way that made it easier to modify the code to your local system. This could e.g. be done by storing relevant paths in variables at the top of the vignette.
I would like the Simdesing examples vignette to include more descriptive text to carry the reader through what is happening, and giving them an impediment to understand the steps. As it is now, it is mainly a collections of snippets to cope-paste (or at least it appears like that with the lack of motivation and discussion)
Functions are in general well-documented. I think it would make sense to put .initClasses and simdesign-class documentation as internal so it doesn't show up in the overview, as the docs are rather lacking (and not that important).
All examples appear to be wrapped in \dontrun{}, which I'm pretty sure CRAN will take offense with. I can understand that running the actual simulations may be too heavy for an example, but a lot of the constructors and getters/setters are pretty benign and should be able to be run in examples
\dontrun{}
wrappers have been removed where possible (see commit)As discussed above, the assumption that NetLogo is already installed is not clearly communicated. The download_netlogo() function should be highlighted in the README and "Getting Started" vignette. Apart from this, the installation processes as expected, and the package works as expected. It is clear that you have to have substantial understanding of NetLogo prior to using this package, but as this is also the intended audience it is not really a problem (some additional hand-holding from the documentation would be nice as discussed above). The performance of the package is mainly a function of the performance of NetLogo, and it does not appear that the package adds any unnecessary computational overhead.
The test coverage is generally good, and it is nice to see that they have made a strategy for testing functionality that can't be tested on CRAN
The package title needs to be in titlecase and the description shouldn't start with "The nlrx package..." (CRAN requirements, not mine). Apart from that the packaging guidelines seems to be followed (again, with the excdeption of the documentation discussed above)
In general, I like the idea of the package and its functionalities. Some dependencies issues that were not stated made the review process a bit frustruating, but I think most of my points are very easy to tackle.
- Installation worked in Windows, with the below problems:
- R dependencies: R (>= 2.10) is mentioned in desciption, but two of the Imports packages (lhs and sf) need R >= 3.3. So nlrx could not be installed on a machine with older version.
- Java: the package is using a java virtual mashine so Java needs to be installed on the local machine but is not mentioned anywhere. Installation is fine without it but of course the package will not function, the path set-up is needed in order to run. So it would be helpful if the information for the above points were stated in the installation guidelines/ README/ description.
- A clearer statement of need is needed (as per rOpenSci guidelines), why the package is needed and should be preferred from just using Netlogo's behavioral space. Some of the text from the onboarding issue would fit nicely.
- README:
- Netlogo dependencies: supported Netlogo versions are mentioned only in the _downloadnetlogo function. This and the above prerequests from (1) should be included in README.
- The first paragraph states what is the package doing in a very broad way. As mentioned above, a bit more emphasis can be given on stating the problems it is designed to solve and target audience.
- Example: this session could be renamed as a Get started session and include the example. Also, starting with 'S4 classes' may be a bit too advanced for a student or a not-R-expert to understand, so I would restate the first paragraph in simpler words. The more techincal description is given at the DESCRIPTION file anyway, which I think is appropriate.
- Help files: Given that NetLogo is a program widely used for teaching, I think a bit more detailed help files chould be of great importance:
- nl contructor: what are the experiment and simdesign should be explained more, or have a link to their help pages.
- experiment constructor: should be more detailed of what information the experiment holds (not only with example code).
- simdesign constructor: design types should be listed in the description, not just stated as examples, maybe also with links to the separate functions descriptions.
- _analyzenl:
- typo: fullstop missing in value session.
- this function is based on the simdesign, but given that the latter is stored in the nl object, it is not so obvious for a new user. Maybe it should be emphasized a bit more in the help file and include links to the separate analyze functions.
- All the analyze_X functions have the same help file. If you think re-adding details of the analysis here (cause they exist in simdesign files) is not useful, you should at least add links to these helpfiles. Especially for:
- _analyzesimple: is an empty function, but nothing is stated in its help file.
- _simdesignsimple: even though constant parameterization is stated, the fact that no analysis is available in this design should be mentioned.
- _run_nlone and _run_nlall have the same title and description. A simple adjustment of 'Execute one NetLogo simulation' and 'Execute all NetLogo simulations' will do better.
- _downloadnetlogo: the saving path is stated as: 'Character string with the name where the downloaded file is saved.'. I think it should be replaced by 'path to folder' as in the rest of help me files with parth as input.
- _util_eval__: all functions help files are not very informative, in terms of what 'evaluate' means. However, the comments in the code of the functions are informative: 'check if there are any variables defined', so a simple copy paste should do.
getstarted.Rmd
- library(nlrx) should be called after the installation code
- library(future) needs to be called before lines 100 and 102 can be implemented (otherwise operator %<-% and unquoted multisession are not recognized)
- If Java path is not connected and the user uses "multisession" plan gets an unhelpful error for a non-existing csv file in Temp. For different plans, they get error for Java path missing, which helps. I don't know if you can add something to that, maybe not use 'multisession' in the getstarted?. But anyway not that important.
- Attach an experiment: it should be mentioned if the user has to define all the model variables or the ones that are not defined will be assigned the default model's values.
nl@experiment:
- When assigning variables lists: if you add parameters X and Y as contants as well you get an error: 'Columns X, Y must have unique names', which doesn't make lots of sense. If you know the problem it is clear that it means that the variables should not be defined as contants, but the error message is not helping the user see whats wrong. A message can be added.
- If a parameter is wrong, after running the simulations, the results are going to be empty but no warning or other message will be shown before you try to assign in the results, and will get an error 'replacement has 1 row, data has 0'. It would be helful if there was a hint to the user to check the parameters in the experiment should be given.
spatial-output.Rmd I couldn't get it to fully work. Two main problems:
- line 86: _get_nlspatial runs but prints a very bad message, a repetiting (a lot of times) 'no non-missing arguments to min; returning Infno non-missing arguments to max; returning -Inf', which I don't know where it comes from. With another model without patches information the message does not appear.
- line 137: for _results_spatialtibble, all metrics.turtles are returned NA. I think it comes from the _fulljoin of ddplyr, which only preserves the grouping of x (so pathces) and the not matching values are ignored and replaced with NA. So the turtles cannot be plotted afterwards, getting error 'object 'turtles_x' not found'
turtles_x
is not returned anymore. This is due to different ways agent coordinates can be returned, either x or pc. nlrx now returns this as column name.And minor:
- library(ggplot2) needs to be called earlier, before first plot of spatial output.
- gganimate: also needs a renderer installed. I would mention or add the command for installing 'gifski' package (the default of gganimate), and also the 'png' package installed (which is needed by gganimate). Also, maybe add hashed-out the direct gganimate installation command (instead of just the link).
analyze_simple Renaming to _noanalysis may be helpful (a 'simple' analysis is still an analysis). So far if the user runs _analyzenl with simple design they get NULL, which can be confusing. A message can be added to be printed in _analuzesimple in order to inform the user that 'no analysis can take place if a simple design is chosen'.
write_simoutput
- Maybe a bit trivial, but when the 'out' folder is not created already, it would be helpful to add to the error message a hint that the user needs to create the folder indicated by the path.
- Concerning the output directory, it would be handy for the user to be able to re-assign the output directory when the write function is being called. Maybe keep the default by the nl object, but if a new one is given as input by the user it would be used.
util_runnl.R: There is some dead commented-out code at the end (lines 500 - 650).
load-model-parameters: The name report-model-parameters would be more representative of this function.
- How does the package responds to runtime errors of the actual netlogo model? I think it inputs the error message in the object and doesn't give any warnings. Maybe you can try to catch some main runtime errors of netlogo and print a warning. Aditionally, if the netlogo model is broken the results will run forever (or at least as much patience I had :P ); maybe you could add something to break the function and show a warning message to check the netlogo model. Or state that the package will not respond to problems in netlogo, so the user can always have this in mind.
system.exit()
which would also terminate the current R-session. However, we improved our documentation and added two new sections to the Advanced configuration vignette.
The Handling NetLogo runtime errors section gives information on runtime errors and hints for debugging.
The Capturing progress of model simulations section includes information on how to capture progress of running simulations which might help for debugging and identifying freezed model simulations. The section gives several examples on how simulation progress can be monitored.
- I didn't find it easy to visualize models of collective motion where the heading of individuals is an important aspect. A plotting example of schooling, for translating netlogo's heading metric system to ggplot/gganimate for individual points would be a very valuable addition.
In total, I think it can be a very useful package. As I mentioned above, the package can gain a lot from improving its documentation. Also of course the variety of Netlogo models is very large so also by including another example of a quite different model may add value to it.
Documentation
README
Add More details on package functionality, in particular
(J)
Add small section berfore Installation with general information and links to NetLogo online resources(J)
Add information that Java jdk 1.8 needs to be installed and java path needs to be set correctly(J)
Add information that NetLogo needs to be installed (link to the download_netlogo function)(J)
Add information which NetLogo versions are supported(J)
Clearer statement of why the package is needed (check Ropensci onboarding submission text)Once readme is final -> Copy to GetStarted vignette and to the roxygen header of the nlrx-package.R file
Vignettes / examples
Code Examples
Examples assume that users run windows, instead we could set the relevant path variables at the start of the examples and have unix and windows default paths; in the following examples we can then use these variablesaddlibrary(nlrx)
to code examples after installation code snippetaddlibrary(future)
to code examples before%<%
is usedlibrary(ggplot2) needs to be called in the examples for spatial outputgganimate needs a renderer installed - add comment for installing gifski and png package -> add to spatial output examplesAdd another code example with a different model where heading is important (flocking?) and how to translate netlogo heading (degree scale) to ggplotVignettes
Simdesigns vignette: add more descriptive information on what the simdesigns doRemove technical details (such as S4 classes) from the GetStarted/Readme sectionsexperiment: variables that are not defined in constants or variables will have their default NetLogo value (add to all relevant sections)Add more information on parallelisation to further notes (locally parallelisation over seeds - split function)Documentation
Example code sections are wrapped in \dontrun{} -> remove for functions that are independent of NetLogo (e.g. class getter/setter functions, ...)Package title in titlecaseDescription should not start with "The nlrx package ...." (CRAN requirement)update R-dependency to R >= 3.3 (lhs package and sf require this)Class constructor
nl constructor: explain more detailed what experiment and simdesign are and link to their help pagesexperiment constructor: more detail on information that is stored within the classsimdesign constructor: design types should also be listed in description, not just stated as examples. Include links to simdesign_helper functionsAnalyze_nl
analyze_nl:fullstop missing in value sectionit is currently not clear what the function does exactly because it depends on the chosen simdesign. This could be stated more clearly, with links to the analyze functions.analyze_simple is empty, either remove or add something useful such as a warning message: "'no analysis can take place if a simple design is chosen"simdesign_simple: The fact that currently no analyze_nl function is available should be mentionedOther
run_nl_one and run_nl_all have the same title and description -> revisedownload_netlogo: replace "Charatcer string with..." with "path to folder"utileval: help files are not very informative - comments of these functions could be recylced to create more useful documentation of these functionsPackage functionality
Error messages
when attaching an experiment, add error message if netlogo variable is defined in variables and constants list (eval_experiment?)If a parameter is wrong, after running the simulations, the results are going to be empty but no warning or other message will be shown before you try to assign in the results, and will get an error 'replacement has 1 row, data has 0'. It would be helful if there was a hint to the user to check the parameters in the experiment should be given.Spatial output
line 137: for results_spatial_tibble, all metrics.turtles are returned NA. I think it comes from the full_join of ddplyr, which only preserves the grouping of x (so pathces) and the not matching values are ignored and replaced with NA. So the turtles cannot be plotted afterwards, getting error 'object 'turtles_x' not found'Write_simoutput
add an option (function input) to reassign the default output path of the nl object with another user-defined folderadd an error message if the directory does not exist, with a hint that it needs to be created firstOther
util_runnl: There is some dead commented-out code at the end (lines 500 - 650).load-model-parameters: The name report-model-parameters would be more representative of this function.Add information that for optimizations run_nl_dyn need to be used instead of run_nl_all/one