Open willgearty opened 1 year ago
Thanks for the proposal, Will @willgearty, and apologies for the slow response! I've finally had a closer look.
I like the proposal but I'm not yet fully convinced that the task view will be sufficiently separated from the existing task views. Relatedly, your process of package selection appears to be somewhat subjective, which we try to avoid in task views by adopting clear inclusion/exclusion criteria. In particular, excluding packages that you feel are too old or that you have no experience with is too subjective.
Hence, I would ask you to establish sufficiently clear rules for inclusion/exclusion of a package, e.g., that it must be explicitly geared towards paleontology or something like that. And rules that would necessitate some individual review process (e.g., to determine whether a package is "useful" or "finished") should be avoided.
Regarding the maintainers: It's great to see an active community proposing a task view. Seven maintainers might still be feasible but maybe a smaller team would be easier to coordinate? Others could still contribute through issues and PRs. Also, I'm not sure whether the palaeoverse community is already diverse and heterogeneous enough that different paleontological views are reflected in it. Or would it help to bring in maybe one person from the outside as well?
I'm also pinging the principal maintainers of the Spatial, SpatioTemporal, and Environmetrics task views here: @rsbivand, @edzer, @gavinsimpson. Maybe you have some thoughts/ideas as well?
Thanks @zeileis for the helpful comments.
We are certainly open to defining clearer rules for package inclusion/exclusion. I think if we are as exclusive as "explicitly geared towards paleontology", we'll be leaving lots of commonly used packages out (but you are right that it would then be a very clear rule). However, most, if not all, of these excluded packages are already in other task views, so they would at least already be covered there.
We'll give a little time for other folks to provide their thoughts/ideas as well, then we'll look into revising accordingly.
Hi all! I am also unsure but, as I see it, the overlap with Phylogenetics is also non-negligible (but you know the TV better than I do). In short, what is not clear to me is: "do you have in mind at least some core packages in your list that are very specific to Paleontology, and not just specific to other related topics but useful for Paleontology?" My question is probably quite naive (maybe these are clearly listed in your proposal but I am not able to identify them). These are the packages that should be put forward in your TV, with packages that have a broader scope but can be useful for the field mentioned afterward. But again, my comment might be completely wrong.
My deepest apologies (to my co-maintainers and the CTV editors) for the horrible delay in responding to the feedback here. Despite some reservations, we've decided to go for a more conservative approach, as suggested by @zeileis, that includes only packages that are either explicitly designed for paleontology or are explicitly advertised to paleontologists (it appears this is similar to the approach of the Agriculture CTV, for example).
There are many other packages that paleontologists use as part of their workflows, and so, as part of the development of this CTV, we plan to suggest many of these packages to other CTVs where we believe they will be appropriate. We then plan to link out to these CTVs to ensure that users of the Paleontology CTV can find all of the resources that they may need for their highly interdisciplinary work (see below).
@tuxette there aren't a lot of interpackage dependencies in paleontology, so I wouldn't say any packages really stand out as "core" packages. However, if I had to pick a handful of packages based solely on their breadth of use, I would probably say palaeoverse, paleotree, and paleobioDB, but I'm probably biased. I'd be happy to look into download numbers in the future to identify which packages are most widely used before finalizing the list of "core" packages.
Here is an updated proposal for the Paleontology CTV:
Computational paleontology (or paleobiology) is a thriving field. Gone are the days of just digging up fossils; paleontologists now have the luxury of being able to perform a wide array of complex computational analyses on local and global compendia of fossil occurrence, phylogenetic, and morphological data to study the functional and phylogenetic evolution of organisms, ecosystem function and ecological interactions, paleobiogeographic patterns, and more. Until recently, computational paleontologists have mostly relied on resources designed for evolutionary biologists, ecologists, GISers, and data scientists to accomplish such analyses. However, slowly but surely, resources (including explicit R packages) are being developed to cater to these paleontological tasks.
This CTV brings together the vast majority of paleontological or paleo-adjacent packages that are in use in paleontology. The purpose of this CTV is to provide young and old paleontologists something of a guide to developing a wide variety of computational paleontological workflows. We have included packages (~50 at the moment) that span both the data acquisition/cleaning and analytical components of such workflows, with analyses covering paleoecology, paleobiogeography, phylogenetics, and more (see sections below).
We have excluded many of the most common packages (e.g., tidyverse, sf) because they are often imported by packages in this CTV and they are often covered exhaustively in other CTVs and guides. Further, to keep the list manageable, we also do not include packages that are often used in paleontological workflows but are not explicitly designed for or advertised to paleontologists. Where applicable, we plan to direct users to other CTVs that include many of these packages (and also plan to submit recommendations to these CTVs as necessary).
chronosphere, folio, neotoma2, paleobioDB, rgbif, rgplates, ridigbio, rmacrostrat, rpaleoclim
CoordinateCleaner, fossilbrush, palaeoverse
deeptime, GEOmap, rphylopic, SDAR, StratigrapheR, tidypaleo
analogue, ecospace, fossil, rioja (and Environmetrics CTV)
Compadre, divDyn, divvy, hespdiv, ppgm, sepkoski (and Spatial CTV)
CladeDate, fbdR, FossilSim/FossilSimShiny, paleobuddy, paleotree, RRphylo, strap (and Phylogenetics CTV)
morphospace (and Phylogenetics CTV)
adePEM, astrochron, evoTS, paleoTS, RRatepol (and TimeSeries CTV)
Bchron, cRacle, DAIME, geoChronR, isogeochem, pastclim, sedproxy
Only 10 of the proposed packages are included in other CTVs (rgbif, analogue, rioja, FossilSim, paleobuddy, paleotree, strap, paleoTS, deeptime, and GEOmap).
@zeileis @tuxette Bumping this since the summer is wrapping up. Please let me know what you think of the new proposal!
@willgearty: Sorry, I completely missed your update of June. I took a look at it today and I think that I understand where this is going. For me, this is convincing but @zeileis has a better global view of CTV and possible overlaps so he might have a different opinion. Also, @rsbivand could have interesting additional insights to provide here maybe? A minor remark is that the titles sometimes give the impression that the corresponding section is slightly out of scope. For instance, the generic title "Time series" is very broad, and until we look at the package list, it is not clear that it doesn't overlap with the TimeSeries task view (also, shouldn't deeptime be included in this section?). I’m not sure exactly how to improve it, but I suspect the time series have a particular focus that could perhaps be reflected more precisely in the title.
Thanks @tuxette. That section should probably be titled "Time series analysis" to better reflect that those packages are for analyzing time series, not just visualizing them (this is also why deeptime is not included). I can definitely go back through the headings once the package list is finalized to make sure they are succinct and descriptive.
Will @willgearty, apologies for the late feedback. I agree with Nathalie @tuxette that this goes in the right direction and that the task view is also well-separated from the existing task view topics. I still think that the explanation of the scope needs to be phrased better - but from the current list of packages it's sufficiently clear to me what you want to do. So you can still improve the scope in the next revision.
In short, I endorse this proposal and suggest we let Will and his co-maintainers work out the details. Roger @rsbivand, Dirk @eddelbuettel, Julia @jpiaskowski, and Nathalie @tuxette, if you agree, you can comment below or just react with a thumbs-up.
This looks great (I endorse). You can also list other relevant task views (e.g., Time Series) and how they specifically support paleontological applications, but that is your choice.
Thanks for the positive feedback, Julia and Dirk. Together with my endorsement you have the necessary three votes (plus Nathalie was also already very positive). So you can move on and elaborate the entire task view.
Do you want to do that first in your own repository and then transfer it later to the cran-task-views organization? Or should I already open cran-task-views/Paleontology/ for you? Either is fine with me.
Fantastic news, thank you all for the feedback and support!
I have a draft in progress here: https://github.com/palaeoverse/PaleontologyTaskView. I'm happy to keep using that and then transfer it later.
Our task view draft is now ready for review: https://github.com/palaeoverse/PaleontologyTaskView/blob/main/Paleontology.md. I'd also appreciate feedback from @benmarwick to make sure our two task views remain unique and complementary.
Just poking this issue to see if you all need anything else from me. @tuxette @zeileis @jpiaskowski
For me, everything is OK and Julia and I have approved the last version so we are probably good. The last step is up to @zeileis : I think that he will soon come back to you for the last technical steps (and/or if he has additional comments).
Will @willgearty, this looks great, apologies for not responding sooner. Nathalie and Julia already endorsed and I'm happy to do so, too, so we can move on. I just have two mini requests:
After that you can transfer the repository to me and I can transfer it to the cran-task-views org and then release it. For the transfer go to Settings > Collaborators > Public Repository > Manage and then Danger Zone > Transfer ownership to "zeileis".
Scope
Computational paleontology (or paleobiology) is a thriving field. Gone are the days of just digging up fossils; paleontologists now have the luxury of being able to perform a wide array of complex computational analyses on local and global compendia of fossil occurrence, phylogenetic, and morphological data to study the functional and phylogenetic evolution of organisms, ecosystem function and ecological interactions, paleobiogeographic patterns, and more. Until recently, computational paleontologists have mostly relied on resources designed for evolutionary biologists, ecologists, GISers, and data scientists to accomplish such analyses. However, slowly but surely, resources (including explicit R packages) are being developed to cater to these paleontological tasks.
This CTV brings together a) a collection of traditional packages that are often seen in use in standard computational paleontological workflows, b) more recent paleontological or paleo-adjacent packages that are commonly in use in paleontology, and c) cutting-edge paleo-explicit packages that we believe should be adopted by the paleontological community. Therefore, the purpose of this CTV is to provide young and old paleontologists something of a guide to developing a wide variety of computational paleontological workflows. We have included packages (~50 at the moment) that span both the data acquisition/cleaning and analytical components of such workflows, with analyses covering paleoecology, paleobiogeography, phylogenetics, and more (see sections below).
We have excluded many of the most common packages (e.g., tidyverse, sf) because they are often imported by packages in this CTV and they are often covered exhaustively in other CTVs and guides. Further, we have excluded older packages that have been superseded by more robust and/or featureful newer packages (e.g., there are a ~million packages related to ENM, but we have only included a handful). We also recognize that there are many other packages out there that are relevant to or explicitly for paleontology (we originally built a list of ~140 packages that we whittled down to the list below). We excluded most of these packages because we, as a group, had little experience with them or because the packages seemed unfinished or too niche to be useful. However, we'd love to hear from anyone that might have suggestions about other packages to include/exclude. Finally, where applicable, we plan to direct users to other CTVs that overlap in scope (see below).
Packages
Data acquisition
mapast, neotoma2, paleobioDB, rgbif, rgplates, ridigbio, chronosphere
Data cleaning
CoordinateCleaner, fossilbrush, palaeoverse
Data visualization
deeptime, ggtern, ggtree, SDAR, StratigrapheR, tidypaleo, geoChronR, rphylopic
Paleoecology
ade4, dismo, ecospace, ENMeval, ENMTools, fossil, fundiversity, vegan
Paleobiogeography and biodiversity
BAT, Compadre, divDyn, divvy, iNEXT, sepkoski
Phylogenetics
caper, diversitree, fbdR, FossilSim, geiger, mvMORPH, paleobuddy, paleotree, phytools, strap
Morphology
geomorph, Claddis, dispRity, morphospace
Time series
paleoTS, evoTS, layeranalyzer
Overlap
There is considerable overlap of the scope of this proposed CTV with the scope of other CTVs, including Environmetrics, Phylogenetics, TimeSeries, and Spatial. This stems from the fact that this proposed CTV is subject-oriented, rather than methodology-oriented. This doesn't appear to be an exception, though, given there are already CTVs on other subjects (e.g., ChemPhys). Further, this CTV is focused on which packages in these other CTVs may be used specifically within computational paleontological workflows.
Maintainers
Principal maintainer: @willgearty (also the principal maintainer of the Phylogenetics CTV)
Co-maintainers: @AlfioAlessandroChiarenza, @bethany-j-allen, @ChristopherDavidDean, @KEichenseer, @LewisAJones, and @pedrolgodoy (this is a @palaeoverse project)