CLARIAH / clariah-plus

This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLARIAH-PLUS in a central place and should facilitate findability, transparency and project planning for the project as a whole.

Implement harvester component for tool discovery #33

Closed: proycon closed this issue 2 years ago

proycon commented 2 years ago

The component is defined in https://github.com/CLARIAH/clariah-plus/blob/main/technical-committee/shared-development-roadmap/epics/shared/fair-tool-discovery.md as follows:

Harvester for software & service metadata. Periodically queries all endpoints listed in the CLARIAH Tool Source Registry, converts metadata to a common scheme, and finally updates the tool store. Endpoints may be git source repositories from which metadata is extracted, or service endpoints that explicitly provide metadata.

I would advocate a simple approach for the harvester, relying on codemetapy and other tools to do the necessary conversions. All details still have to be worked out.
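To make the "simple approach" a bit more concrete, here is a minimal sketch of the harvest loop described above: query each endpoint, convert the result to a common scheme, update the tool store. All names (`harvest`, `fetch`, `convert`, `store`) are hypothetical and only illustrate the shape of the component. The `convert` callable is where a tool such as codemetapy would plug in; its actual API is deliberately not shown here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Metadata = Dict[str, object]


def harvest(
    endpoints: Iterable[str],
    fetch: Callable[[str], Metadata],       # hypothetical: retrieves raw metadata for one endpoint
    convert: Callable[[Metadata], Metadata],  # hypothetical: maps raw metadata to the common scheme
    store: Dict[str, Metadata],             # hypothetical tool store, keyed by endpoint URL
) -> List[Tuple[str, str]]:
    """Run one harvest pass and return (url, error message) pairs for failed endpoints."""
    failed = []
    for url in endpoints:
        try:
            store[url] = convert(fetch(url))
        except Exception as exc:
            # One broken endpoint should not abort the whole pass;
            # the previous entry for this URL (if any) stays in the store.
            failed.append((url, str(exc)))
    return failed
```

A periodic run would then simply call `harvest(...)` from a scheduler (cron, CI job) with the URLs from the source registry. Keeping fetching and conversion as separate callables is one way to support both git repositories and service endpoints without changing the loop.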

Depends on implementation of a data component: Tool Source Registry, currently described as follows:

Simple registry of software source repositories and service endpoints. Serves as input for the harvester.

This source registry could be implemented simply as a plain text list of URLs in a git repository on GitHub, where new registrations are added using pull requests. Alternatively, it could be implemented using the planned baserow database that holds all software components.
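As an illustration of the plain-text option, the following sketch reads such a list into the set of endpoints for the harvester. The file format is an assumption, not something decided in this issue: one URL per line, blank lines ignored, lines starting with `#` treated as comments. The function name `parse_source_registry` is hypothetical.

```python
from typing import List
from urllib.parse import urlparse


def parse_source_registry(text: str) -> List[str]:
    """Return the unique http(s) URLs listed in a plain-text registry, in file order."""
    urls = []
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        # Assumed convention: skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        parsed = urlparse(line)
        # Ignore anything that is not an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if line not in seen:
            seen.add(line)
            urls.append(line)
    return urls
```

Silently skipping malformed lines keeps the harvester running, but a pull-request check on the registry repository could use the same function to reject bad entries before they are merged.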

proycon commented 2 years ago

This harvester component is being implemented in https://github.com/proycon/codemeta-harvester