Biodiversity Virtual Research Environments - working towards a common roadmap

PaulKiddle commented 4 years ago

The Biodiversity Next symposium "Biodiversity Virtual Research Environments: past, present & possible futures" explored avenues of cooperation and coordination for VRE systems.

Many of these VREs appear to be heading in a similar direction; in particular, several are exploring microservice-based frameworks. This could provide a point of collaboration between the systems.

This issue has been created to build upon this consensus and discuss plans for how these systems might be aligned. It is intended to be an open discussion for anyone working with or interested in VREs, so even if you weren't at Biodiversity Next, please feel free to get involved.

A rough roadmap:

VRE Gap Analysis

A good starting point is to understand the extent of overlap in data models and functionality between different VREs. The Natural History Museum London will undertake a gap analysis of current VRE systems, including (so far - please suggest others in the comments below):

Scratchpads
TaxonWorks
PlutoF
DINA
EarthCape
Xper3

Once completed, we will publish this gap analysis, and updates/results will be posted here. If anyone would like to be involved in this paper, please comment or contact us.

Common data model

If the gap analysis identifies a level of homogeneity across these systems, this could help inform a common data model and functional requirements for shared services.
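As a rough illustration of how "level of homogeneity" might be quantified, here is a minimal sketch that compares the field names of several data models using Jaccard similarity and lists the fields common to all of them. The system names (`vre_a`, `vre_b`, `vre_c`) and field names are hypothetical placeholders, not taken from any real VRE, and the actual gap analysis may well use a different measure.

```python
from itertools import combinations

# Hypothetical field inventories: placeholder names only,
# not drawn from any real VRE's data model.
field_inventories = {
    "vre_a": {"scientific_name", "collector", "locality", "event_date"},
    "vre_b": {"scientific_name", "locality", "event_date", "media_url"},
    "vre_c": {"scientific_name", "collector", "dna_sequence"},
}


def jaccard(a, b):
    """Fraction of fields shared by both sets, out of all fields in either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(inventories):
    """Jaccard similarity for every pair of systems, keyed by (name, name)."""
    return {
        (x, y): jaccard(inventories[x], inventories[y])
        for x, y in combinations(sorted(inventories), 2)
    }


def shared_core(inventories):
    """Fields present in every system: candidates for a common data model."""
    if not inventories:
        return set()
    return set.intersection(*inventories.values())


if __name__ == "__main__":
    for pair, score in pairwise_overlap(field_inventories).items():
        print(pair, round(score, 2))
    print("shared core:", shared_core(field_inventories))
```

In practice, field names would first need mapping to a shared vocabulary, since two systems may use different names for the same concept; comparing raw names would understate the real overlap.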

Alongside this, we need to know which data standards exist to help inform the model.

GBIF / EML
EDIT CDM

Existing frameworks

Are there existing VRE/data frameworks that could implement this common data model/service layer? We do not want to reinvent the wheel, and a more abstracted system, one not limited to biodiversity data, should be more resilient to change. Potentially of interest:

GO FAIR Implementation Network BiodiFAIRse
VRE4EIC
SOLID
FAIR.ReD
WikiData

What others should be included?

Other initiatives

This initiative has a large overlap with the work of the Research Data Alliance VRE Interest Group. At the very least, we need to keep an eye on their outputs and recommendations, and we should probably reach out to them once we have a clearer idea of a plan.

PaulKiddle commented 4 years ago

I'm returning to look at this project again, and I've had some thoughts about the direction we might take.

I think the outcome of this process should be:

High-level guidelines for VRE development that aim to reduce developer effort and increase maintainability
Recommended libraries (or other resources) for adding specific functionality for a given programming language - to encourage sharing and community maintenance of common code
A group to continually develop/maintain these resources, and to promote their use

There has been some talk of recommending a common framework/platform. I think anticipating all of the different requirements for all VREs would be an impossible task, and could lead to software bloat (this is one of the reasons Scratchpads has become difficult to maintain).

Instead, issuing guidance and recommendations allows developers to create a unique platform while benefiting from shared knowledge, and allows the various VRE platforms and services to integrate easily with each other.

Finally, it could be very easy to put a lot of work into creating a resource that simply never gets used. An advocacy group seems to me the best way to make sure what we produce actually gets put into action.

I'm really keen to get some input from others on this. Does this sound like a good approach? Are there drawbacks?

mjy commented 4 years ago

I think you've laid out the issues really nicely, thanks for this. I think, as you mention, that this all depends on developers making conscious commitments to this approach. I can vouch for our group and say we'd be on board to commit time and effort in some capacity, but we'd likely need to figure out a minimum buy-in across the board to push forward.

As you noted, part of what we discussed is to err on the side of lightweight general help rather than extensive documentation and guidelines. Of course we need both at different levels, and we've got a bit of a chicken-and-egg problem too, so this is a tough one.
