Open jhollist opened 8 years ago
hey jeff - I vote for continuing to work on it, but not submitting as a CRAN task view - it's a great resource and people will find it when googling
I agree that we should keep it going. It is a nice resource.
Is there any merit to redoing the format? Straight-up markdown, as opposed to .ctv, would make it a bit easier for others to contribute. Could also do it as a web page, although not sure there is much benefit to that over the repo itself.
We should at minimum have links to all rOpenSci task views at https://ropensci.org/packages/, no? We could possibly ingest the task views to display them on the web site, but that seems like more work than it's worth.
I think there might be something to be said about creating a GitHub-native format similar to Task Views (perhaps like the "awesome" lists), as well. The number of packages on CRAN is too big to really be able to have up-to-date Task Views that have complete coverage while also maintaining the high visibility that comes from having a small number of Task Views. It'd be much easier to create and maintain narrower Task Views on GitHub, where there might only be ~10 or so packages on a topic. But then they're not easily discoverable unless they all live in one place/GH organization. Something to think about...
where there might only be ~10 or so packages on a topic
does that mean you'd prefer to split up web tech & open data? or did you mean could have as little as 10?
I think we could do that, but more than that, I think it would be useful to have a TV-like format that works for smaller topics (i.e., not enough for a Task View but worth having an up-to-date page about).
If I may drop two cents:
I would definitely keep the maptools task view. It is a great resource. I think its primary advantage over existing CRAN Task Views (CTV) and awesome lists (AL) is that it is focused. IMHO some of the CTVs are too broad (e.g. webtools), and that is even more the case with a lot of ALs.
The purpose of CRAN Task Views was to have curated lists of packages that:
As @leeper wrote, we now have so many packages for so many purposes that the principle of having a limited number of CTVs becomes impractical. R evolves and specializes: there are more and more tools addressing relatively narrow fields of application. In that sense, ad (1), it would be good for CTVs to become more fine-grained. Take interacting with resources on the Web as an example. Some years ago there were only download.file, the url connection, and the XML package for that, and look where we are now. Keeping (2) up to date is more difficult as the View grows, unless you are able to mobilize package authors to keep their package descriptions current. Functionality (3) is, I think, not used by many people, and it is even less useful if a CTV contains hundreds of packages. See:
vs <- ctv::available.views()
sapply(vs, function(v) structure(nrow(v$packagelist), names=v$name))
                 Bayesian                  ChemPhys            ClinicalTrials                   Cluster
                      115                        84                        46                       102
    DifferentialEquations             Distributions              Econometrics            Environmetrics
                       22                       185                       121                       109
       ExperimentalDesign                   Finance                  Genetics                  Graphics
                       63                       141                        31                        42
 HighPerformanceComputing           MachineLearning            MedicalImaging              MetaAnalysis
                       83                        80                        28                        71
             Multivariate NaturalLanguageProcessing      NumericalMathematics        OfficialStatistics
                      124                        36                        69                        72
             Optimization          Pharmacokinetics             Phylogenetics             Psychometrics
                       97                         8                        76                       133
     ReproducibleResearch                    Robust            SocialSciences                   Spatial
                       70                        51                        83                       142
           SpatioTemporal                  Survival                TimeSeries           WebTechnologies
                       58                       203                       196                       155
                       gR
                       36
and that does not count dependencies of packages in a CTV.
I was also thinking about the format, because writing CTV XML by hand is cumbersome, especially now that (R)Markdown is around. The main purpose of the CTV format was to be able to:
a. Easily display it on the web.
b. Construct the package list automatically.
Both of these goals can be accomplished with RMarkdown, which would even allow embedding images, which are not allowed at this moment. Perhaps Achim Zeileis responsible for CTVs would be open to some new ideas...
@mbojan
Provide the user the possibility to install all the packages in CTV with a single function call.
Do you think people do this anymore?
Perhaps Achim Zeileis responsible for CTVs would be open to some new ideas...
I'd be surprised if they were open to Rmd, but doesn't hurt to ask - they did allow markdown vignettes in pkgs
Seems like it should be possible to generate the ctv XML format from an RMarkdown source file, no?
@cboettig That is what we currently do for webservices and opendata. It's a simple markdown file that is pandoc-ed (+ a little bit of R-ed) into the CTV format.
And maptools was modelled after those and uses the same approach.
I'd also be interested in what others think about how often people "install" a CTV? Would it be useful to have an install.views for CTVs on GitHub?
I didn't even know it was possible to install a task view! And the thought of it makes me queasy.
But @jhollist's idea to support installation from a CTV on GitHub seems like a clever way to define and facilitate installation of a constellation of packages, e.g. for a workshop.
@sckott, I would be surprised if anybody installs whole CTVs nowadays. While I can imagine some pharmacokineticist (?) installing 8 packages (+dependencies), I can't imagine installing 141 packages (+dependencies) even if you are a financial analyst. The Distributions CTV, grouping functions that model different families of statistical distributions, is perhaps useful as a reference, but probably nobody will ever need to install and use all of them, even a Bayesian fundamentalist. Anyway, unfortunately I don't think there is any data on CTV usage, from CRAN web stats or otherwise.
I was playing around with an Rmd document with some dedicated YAML fields and |package_name| syntax for package names. A piece of R code converts it to a .ctv file, collecting the package names from ||. I did not go very far implementing it though.
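A minimal sketch of what that extraction step could look like, assuming package names are wrapped in single pipes (e.g. |ggmap|) and consist of letters, digits, and dots, starting with a letter. The helper name extract_view_packages() and the sample document are hypothetical; this is not the code from the experiment mentioned above.

```r
# Hypothetical helper: collect package names written as |pkg| in an Rmd source.
extract_view_packages <- function(lines) {
  # Assumed rule: a name starts with a letter, then letters, digits, or dots.
  pattern <- "\\|([A-Za-z][A-Za-z0-9.]*)\\|"
  hits <- regmatches(lines, gregexpr(pattern, lines))
  # Strip the surrounding pipes and drop repeated mentions.
  pkgs <- gsub("|", "", unlist(hits), fixed = TRUE)
  sort(unique(pkgs))
}

# Made-up view source, for illustration only.
rmd <- c(
  "---",
  "name: maptools",
  "---",
  "Use |ggmap| for base maps and |leaflet| for interactive ones.",
  "|ggmap| also wraps several geocoding services."
)
view_pkgs <- extract_view_packages(rmd)
print(view_pkgs)
```

The resulting character vector is what a converter would then write into the package list of the .ctv file.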
A GitHub-hosted, RMarkdown-based Task View that could trigger installation of packages would be something useful, I think. Indeed quite useful in the context of workshops etc. (I did not think of that!). Definitely more lightweight than creating a "metapackage" with the wanted packages as dependencies, and more "secure" than sourcing an R script from the web that might contain system("rm ~") or some similar kind of joke.
If you want to install all packages from a task view, then what you really want is for the task view itself to be a package with all of its listed packages as dependencies, right?
@leeper I guess you could just create a package with the task view document as a vignette, and have all the packages listed in the Suggests field in DESCRIPTION. Then install.packages(..., dependencies=TRUE) to have the suggested packages installed right away. packageDescription.
The view mechanism seems lighter, which I like. Just put a CTV-like file in a GH repo. Read it on GH. Install the packages by calling something like install_view("leeper/myview") that would fetch the file, get the package names and install them. That might even allow the view to include packages that are not on CRAN, but on GH etc. I think that actually might be very nice for @jennybc's case with workshop-like setups. We might also have something like update_view() that would just update the view-related packages.
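A rough sketch of how install_view() and update_view() might work. Neither function exists; the names come from the comment above. Everything else is an assumption: the |pkg| markup for package names, the default file name and branch, the raw.githubusercontent.com URL layout, and the restriction to CRAN packages (handling GitHub-only packages would need more work).

```r
# Hypothetical sketch; install_view() and update_view() are not existing functions.

# Assumed markup: package names wrapped in pipes, e.g. |ggmap|.
view_packages <- function(lines) {
  hits <- regmatches(lines, gregexpr("\\|[A-Za-z][A-Za-z0-9.]*\\|", lines))
  unique(gsub("|", "", unlist(hits), fixed = TRUE))
}

# Resolve "user/repo" to a raw file URL (assumed layout); an existing local
# path is used as-is, which makes it possible to try this without a network.
view_source <- function(repo, file = "README.md", branch = "master") {
  if (file.exists(repo)) return(repo)
  sprintf("https://raw.githubusercontent.com/%s/%s/%s", repo, branch, file)
}

install_view <- function(repo, ..., dry_run = FALSE) {
  pkgs <- view_packages(readLines(view_source(repo, ...), warn = FALSE))
  # Only install what is not already in the library.
  to_install <- setdiff(pkgs, rownames(installed.packages()))
  if (!dry_run && length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

update_view <- function(repo, ...) {
  pkgs <- view_packages(readLines(view_source(repo, ...), warn = FALSE))
  # Restrict the update to the packages named in the view.
  update.packages(oldPkgs = pkgs, ask = FALSE)
}

# Intended usage (not run here):
# install_view("leeper/myview")
```

Because the view is only parsed for names and never evaluated, nothing in the file can execute code, which is the "more secure than sourcing a script" property discussed earlier in the thread.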
As the view in principle does not contain any R code, wrapping it as an R package seems a bit of an overkill to me...
A CRAN-compliant package can be just a DESCRIPTION file and a (possibly empty) NAMESPACE file. Even if it contains no code, documentation, tests, or vignettes, it is still installable and therefore provides a mechanism for Depends/Imports/Suggests installations via install.packages(), install_github(), etc. without needing to write a new package to handle that.
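A minimal sketch of such a bare "metapackage", built in a temporary directory. The package name myview, the maintainer details, and the Suggests list are made up for illustration, and the result has not been run through R CMD check.

```r
# Build a bare metapackage skeleton: a DESCRIPTION plus an empty NAMESPACE.
# All field values below are placeholders, not a real package.
pkg_dir <- file.path(tempdir(), "myview")
dir.create(pkg_dir, showWarnings = FALSE)

desc <- cbind(
  Package = "myview",
  Title = "Packages Listed in My Task View",
  Version = "0.0.1",
  Author = "A. Maintainer",
  Maintainer = "A. Maintainer <maintainer@example.org>",
  Description = "Lists the packages from a task view as dependencies.",
  License = "CC0",
  Suggests = "ggmap, leaflet, sp"
)
write.dcf(desc, file = file.path(pkg_dir, "DESCRIPTION"))

# An empty NAMESPACE is enough, since the package exports nothing.
file.create(file.path(pkg_dir, "NAMESPACE"))

# Read the field back to see what an installer would pick up.
suggests <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Suggests")
view_deps <- strsplit(unname(suggests[1, "Suggests"]), ",\\s*")[[1]]
print(view_deps)
```

Pushed to a GitHub repository, a directory like this could then be installed with the usual tools, with the Suggests field deciding which packages come along when dependencies are requested.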
@leeper sure, that's what I meant in the "package solution" above. In fact, there are such "metapackages" on CRAN, e.g. statnet.
Another functionality available with CTVs which would be difficult to mimic with a package is that you can mark some of the packages as "core". When installing the view you could choose whether to install all or just the core. With a package and existing functions it is all or nothing.
But I guess we are somewhat drifting away from the original topic of this thread...
Over the last year or so I've been maintaining the https://github.com/ropensci/maptools task view. It hasn't seen much activity (until today with a few new packages added). Main reason behind this is that there were some concerns about adding it as an official CRAN Task View due to the overlap between this and the existing Spatial Task View. At this point, it is not clear if the task view has a home (even after significant editing) as a CRAN Task View.
I think there is enough unique about what we have (i.e. links to source repositories and/or links to packages not on CRAN) to keep it around. I am just wondering how best to keep this information and disseminate it. So, if others are looking for something to think on and discuss today or tomorrow, I'd be grateful for the thoughts. Feel free to add to the issue here or just ping me directly.