rug-compling / Alpino

Alpino parser and related tools for Dutch
GNU Lesser General Public License v2.1

Request to improve software metadata for CLARIAH #12

Closed proycon closed 1 year ago

proycon commented 1 year ago

In CLARIAH we are automatically collecting software metadata for all tools in CLARIAH-PLUS (and CLARIAH-CORE). This metadata is automatically and periodically harvested directly from the source code repositories, and this Alpino git repo is one of those sources. The advantage of this approach is that metadata is as close to the source as possible, reflects the actual software version, and developers retain full authorship and control without needing any middlemen.

The results are published daily on https://tools.clariah.nl/ and this will in turn be queried by other platforms (Ineo, CLARIN VLO) to disseminate the tools in portals for end-users.

The harvesting is set up in such a way that various existing metadata formats are supported and automatically converted. The whole idea is to burden developers as little as possible, use standards they already use, and prevent any unnecessary duplication of metadata fields. But Alpino is not currently using any metadata scheme, so our harvester doesn't have much to fall back on, and as a consequence the metadata quality is rather poor.

In January a call went out asking all CLARIAH developers to take a look at this metadata and to improve it where needed (see https://github.com/CLARIAH/clariah-plus/issues/143). Alpino has a long history in CLARIAH and CLARIN and is widely used, so we'd like to have good metadata for it. Could you take a look at improving the metadata? Alpino's results currently look like this: https://tools.clariah.nl/alpino.

It would also help a lot if you could use GitHub's release mechanism (i.e. git tags) to tag releases of Alpino, each with a semantic version.
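To illustrate what "a semantic version" means for a tag name, here is a minimal Python sketch that checks whether a tag looks like MAJOR.MINOR.PATCH. The function name `is_semver_tag`, the optional `v` prefix, and the simplified pattern are assumptions made for illustration; this is not part of the CLARIAH harvester and it is looser than the full Semantic Versioning grammar.

```python
import re

# Simplified pattern (an assumption, not the full SemVer grammar):
# optional "v" prefix, MAJOR.MINOR.PATCH without leading zeros,
# then optional pre-release ("-rc.1") and build ("+build.5") suffixes.
SEMVER_TAG = re.compile(
    r"v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?"
)


def is_semver_tag(tag: str) -> bool:
    """Return True if the whole tag name matches the simplified pattern."""
    return SEMVER_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    # Hypothetical tag names, for illustration only.
    for tag in ["v1.2.3", "1.0.0-rc.1", "release-2023", "1.2"]:
        print(tag, is_semver_tag(tag))
```

A tag such as `v1.2.3` passes this check, while a date-style or two-part name such as `release-2023` or `1.2` does not.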

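Please see the contributing guidelines (https://github.com/CLARIAH/tool-discovery/blob/master/CONTRIBUTING.md) and the CLARIAH Software Metadata Requirements (https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md) for in-depth instructions on what metadata to provide and how this can be accomplished.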

gertjanvannoord commented 1 year ago

Well, Alpino is not CLARIAH or CLARIN software... so I'm afraid this has no priority.

On Fri, Mar 3, 2023 at 4:34 PM Maarten van Gompel wrote:

[quoted copy of the original request above]

gertjanvannoord commented 1 year ago

A metadata file has since been added.