r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics
https://r-spatial.github.io/spdep/

Set up package documentation website #23

Closed: angela-li closed this issue 5 years ago

angela-li commented 5 years ago

If you'd like, I can set up a package documentation website for spdep similar to the sf package website and the spData package website. Would that be of interest?

If so, would it be possible to create a gh-pages branch for me to work on?

Thank you so much!

edzer commented 5 years ago

Hi Angela, that sounds like a nice idea! The simpler setup is to have everything in a docs directory, and have GitHub serve that as the spdep docs website, rather than the gh-pages branch. (We could go as far as using tic, but I feel that is still high maintenance, and spdep is also not highly dynamic.) Maybe you could make a PR for the docs directory, and give instructions on how you created it? Are you thinking of markdown-ing the current Rnw vignettes?

angela-li commented 5 years ago

Ok, sounds good. Just made my first PR (#24) to spdep! Thank you Edzer for the encouragement.

I can attempt to change the Rnws over to Rmd/md so that they're included in the site, but maybe the better idea is to break them up, or even write new articles for the documentation website, maybe following these instructions.

rsbivand commented 5 years ago

Thanks for considering contributing, but do we know what kind of extra documentation is desirable? All documentation is costly in time, and I'm not sure people make adequate use of what is there already. For example, people don't look at the existing documentation of the method= argument to ML fitting functions, do cite Bivand et al. (2013) which describes the argument (how to compute the Jacobian), and still complain that the default ("eigen") doesn't work for larger n.
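For readers unfamiliar with the Jacobian mentioned above, here is a brief sketch for the spatial lag model, using notation that is an assumption here and not taken from the thread: $\rho$ is the spatial coefficient, $W$ the $n \times n$ spatial weights matrix, and $\lambda_i$ its eigenvalues. The log-likelihood contains the log-determinant of $I - \rho W$, which the eigenvalue approach evaluates as a sum:

```latex
\ell(\rho, \beta, \sigma^2)
  = -\frac{n}{2}\ln(2\pi\sigma^2)
    + \ln\lvert I - \rho W \rvert
    - \frac{1}{2\sigma^2}\,
      (y - \rho W y - X\beta)^{\top}(y - \rho W y - X\beta),
\qquad
\ln\lvert I - \rho W \rvert = \sum_{i=1}^{n} \ln(1 - \rho\,\lambda_i).
```

Finding all $n$ eigenvalues of a dense $n \times n$ matrix takes on the order of $n^3$ operations and $n^2$ memory, which is the usual reason an eigenvalue-based default becomes impractical for larger n.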

I'm rather unsure what the motivation is for potentially misleading people by exposing the development documentation rather than the released documentation. The documentation ships with released packages; it is often not what is in the development repository for new or changed functionality. If we need a website, it also needs maintenance, and I don't see that as a priority. (I think a website is superfluous unless it can be demonstrated that people who don't read the documentation that ships with packages, which is available offline and in the RStudio help tab, would read the website and understand that the version they are running is not the same as the development version.) So I'm not really convinced about this - is it a sensible use of limited time, or is it just "nice to have"? Would it make more sense to do, say, a method= vignette?

edzer commented 5 years ago

I have never seen an issue of someone being confused by the difference between the devel version docs on the github website and the released software on CRAN - either users understand the difference or the differences are too small. I believe the benefits of pkgdown generated documentation outweigh the risk of this confusion. Package sf and so on now auto-generate docs on each commit; you can also decide to only update docs based on CRAN release (one more thing to remember).

@angela-li do you have an idea how you would ideally see a restructuring of the info in the vignettes?

angela-li commented 5 years ago

Fair points @rsbivand - here are my thoughts on what you're saying:

  1. Extra documentation can take time.
    • Generating the docs folder takes 10 minutes at the most. pkgdown automatically creates the website from existing documentation, which speeds up the process of making the website.
    • As @edzer mentioned, you can even set up spdep with Travis to automatically create the website every time you push new changes. Maybe not something to do now, but if this ends up being a burden, it can be automated.
  2. People don't read existing documentation, so a new website won't help.
    • I believe these are two complementary lines of work. The website can encourage searchability and browsing of existing docs (websites often do for me when I'm looking at a package - I find it easier to navigate a website than a PDF doc). And if we find that people don't read existing documentation (you identify a great instance where people don't seem to get what they need to do for a specific use case), we can think about improving the documentation/function so that it's more easily accessible to new users of the package.
    • Perhaps the workflow is to identify instances where people keep tripping up, and then improve the examples/docs to make clear what options they do/do not have, and what they should do in cases where the defaults break down. You know better than anyone what people have trouble with - we could keep track of these cases in the Issues.
  3. Users will get confused between the released and the development docs.
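A minimal sketch of the configuration behind the docs-folder setup discussed above, assuming the site is built with pkgdown and served by GitHub from the docs directory. This file is illustrative and not quoted from the spdep repository; `destination: docs` is already pkgdown's default and is spelled out only for clarity.

```yaml
# _pkgdown.yml (illustrative sketch, not the actual spdep file)
url: https://r-spatial.github.io/spdep/   # public address of the site
destination: docs                          # folder GitHub serves the site from
```

With a file like this in the package root, running `pkgdown::build_site()` regenerates the docs folder from the existing help pages and vignettes.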

I'll open another issue about breaking up/updating the vignettes, or writing new articles for the site. At the very least, something like a "quickstart" could be useful! And something to explain the methods as well.

rsbivand commented 5 years ago

There are two or three candidates for documentation:

  1. The mentioned method= argument to functions now named lagsarlm(), errorsarlm() and sacsarlm(), that is the spdep maximum likelihood estimation functions;

  2. The recently updated and modified Durbin= argument to lagsarlm(), errorsarlm() and sacsarlm(), as well as lmSLX(), that is the functions in Halleck Vega and Elhorst (2015) named GNS, SDM, SDEM and SLX, the ones with WX included. In particular, the one-sided formula interface to the arguments permits subsets of X to be lagged. This came up in a discussion in an ERSA workshop in Vienna in 2016, that spatial lags of dummies may not make sense, so I implemented it to give us something to go on in the maximum likelihood setting.

  3. Very recently, I've fitted in Bayesian estimation for SEM, SDEM, SLM, SDM, SAC and GNS, using the same Durbin= framework, and the internals of Virgilio Gómez-Rubio and Abhirup Mallik's GSoC project in 2011, translating the Spatial Econometrics toolbox code for sar_g, sdm_g, sem_g and sac_g. It's still preliminary, was done for spReg_lag() earlier, and spReg_sem() and spReg_sac() at the last CRAN release. It's still rough, but does have impacts. SEM/SLM have Griddy Gibbs and Metropolis/Hastings for the spatial coefficients, SAC only has Metropolis/Hastings. None have heteroskedastic variance in the code, because it was hard getting it to work, and in the lag case meant doing much more computation in each iteration. I have bits of a script running against Virgilio's SEMCMC implementations, and INLA's "slm" latent model. If you've read this far, might doing something together for the JSS Bayesian SI make sense?

That's enough work for the five years starting 2019, right?

angela-li commented 5 years ago

The website looks great! Thanks all, I'm bookmarking https://r-spatial.github.io/spdep/ right now 😄

rsbivand commented 5 years ago

Good, thanks for taking the initiative on this. Do you know where traffic statistics are kept?

angela-li commented 5 years ago

I think you can set up Google Analytics for the page by adding a Google Analytics tracking id to _pkgdown.yml, as described in the pkgdown documentation here.
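As a rough sketch of what that could look like in pkgdown releases from around the time of this thread (the tracking ID below is a placeholder, and newer pkgdown versions may handle analytics differently, so check the documentation for your version):

```yaml
# _pkgdown.yml (illustrative sketch)
template:
  params:
    ganalytics: UA-000000-01   # placeholder tracking ID, not a real one
```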

There is GitHub Traffic Analytics, but I believe it tracks traffic to the repo, not the .github.io website.

rsbivand commented 5 years ago

OK, probably not a good idea, seems to need JS implants in each page. Might look at vignettes.

rsbivand commented 5 years ago

Another question - @edzer also - I tried to do the same for classInt, but no pages appear - what have I misunderstood?

edzer commented 5 years ago

You didn't enable Settings -> GitHub pages on the repo.

rsbivand commented 5 years ago

Of course ... thanks!