DataONEorg / mnlite

Light weight read-only DataONE member node in Python Flask
Apache License 2.0

New repo robots.txt, sitemap.xml, and dataset landing json-ld reporting #21

Open iannesbitt opened 1 year ago

iannesbitt commented 1 year ago

@mbjones has suggested that the onboarding process would benefit from a report on a new repository's robots.txt, sitemap.xml, and a dataset landing page: whether each one exists and, perhaps, a simplified version of its contents, including the json-ld extracted from the landing page. This would be a quick way to establish how ready the repo is for schema.org harvesting.

Still to be decided is whether this will work best as an independent script, or as a set of functions under mnonboard that an independent script can call.
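As a rough illustration of the kind of checks such a report might run, here is a minimal standard-library sketch. The function names (`extract_jsonld`, `sitemaps_from_robots`, `summarize`) are hypothetical and not part of mnonboard. It assumes the robots.txt text and the landing page HTML have already been downloaded.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._chunks))


def extract_jsonld(html):
    """Return the parsed JSON-LD documents found in a landing page (hypothetical helper)."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    docs = []
    for block in parser.blocks:
        try:
            docs.append(json.loads(block))
        except json.JSONDecodeError:
            pass  # skip malformed blocks; a real report would probably flag them
    return docs


def sitemaps_from_robots(robots_txt):
    """Return the URLs named by 'Sitemap:' directives in robots.txt text."""
    urls = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def summarize(robots_txt, landing_html):
    """Build a small readiness summary from already-downloaded content."""
    docs = extract_jsonld(landing_html)
    return {
        "sitemaps": sitemaps_from_robots(robots_txt),
        "jsonld_blocks": len(docs),
        "jsonld_types": [d.get("@type") for d in docs if isinstance(d, dict)],
    }
```

Keeping the download step out of these functions means the same code could sit behind either a standalone script or mnonboard.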

mbjones commented 1 year ago

Ideally what I would like is for that script to be both callable from the command line and deployable as a web service. For example:

$ ./so-report.py --profile dataone-full "https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2SB3X09D"
# OR as a web service call:
$ curl -H "Accept: text/csv" "https://api.dataone.org/so-report/dataone-full/https%3A%2F%2Farcticdata.io%2Fcatalog%2Fview%2Fdoi%3A10.18739%2FA2SB3X09D"

The intent of the "profile" parameter is to select among different SHACL profiles (given by name, without the .ttl extension). This obviously needs more thought, especially the second form, which would presumably also have a default text/html option for returning a report.
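To make the two pieces above concrete, here is a small sketch of how a profile name might map to a .ttl file and how a dataset URL might be embedded in the web service path. The helper names, the `profiles` directory, and the allowed-character rule are assumptions for illustration only.

```python
import re
from pathlib import Path
from urllib.parse import quote, unquote

# Assumption: profile names are simple file stems, with no path separators.
PROFILE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def profile_path(name, profile_dir="profiles"):
    """Map a profile name (no .ttl extension) to a SHACL file path."""
    if not PROFILE_NAME.fullmatch(name):
        raise ValueError(f"invalid profile name: {name!r}")
    return Path(profile_dir) / f"{name}.ttl"


def service_url(base, profile, target):
    """Build <base>/<profile>/<percent-encoded target URL>."""
    # safe="" also encodes ':' and '/', so the target stays one path segment.
    return f"{base.rstrip('/')}/{quote(profile, safe='')}/{quote(target, safe='')}"


def decode_target(segment):
    """Recover the original target URL from the encoded path segment."""
    return unquote(segment)
```

One detail worth noting: if the dataset URL already contains percent-escapes (as the DOI in the example does), encoding it losslessly turns each `%` into `%25`. The service would have to decide whether it expects that doubly-encoded form or the singly-encoded one shown in the curl example.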

iannesbitt commented 1 year ago

This is a great idea. Would require me reworking part of the onboarding script but probably worth it. I'm not as familiar with the process of turning it into a web service, but I assume it wouldn't be too difficult given a well defined working command line tool.

One question I have: is there a place where we've collected the profiles we'd be testing against here? I just have the one (soso 1.2.3) so far.

mbjones commented 1 year ago

That soso 1.2.3 is the main one we have settled on, but we've discussed having others, and there are examples of others in the same directory as the soso1.2.3.

Getting your script set up as a web service should be straightforward if you have everything encapsulated in well-defined functions, and don't have any logic in main() or the command-line parsing functions for the CLI.
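A minimal sketch of that layout, using only the standard library: one core function holds the logic, and the CLI and web entry points are thin wrappers around it. All names here (`build_report`, `main`, `wsgi_app`) and the placeholder report contents are hypothetical. A Flask view could play the same role as the WSGI callable.

```python
import argparse
import json


def build_report(url, profile):
    """Core logic lives here (placeholder body; a real one would fetch and validate)."""
    return {"url": url, "profile": profile, "checks": []}


def main(argv=None):
    """CLI wrapper: parse arguments, delegate, print. No report logic here."""
    parser = argparse.ArgumentParser(description="schema.org readiness report")
    parser.add_argument("--profile", required=True, help="SHACL profile name")
    parser.add_argument("url", help="dataset landing page URL")
    args = parser.parse_args(argv)
    print(json.dumps(build_report(args.url, args.profile)))
    return 0


def wsgi_app(environ, start_response):
    """Web wrapper for paths like /so-report/<profile>/<target URL>."""
    # PATH_INFO is normally already percent-decoded by the WSGI server,
    # so the target URL is everything after the second slash.
    parts = environ.get("PATH_INFO", "").lstrip("/").split("/", 2)
    if len(parts) != 3 or parts[0] != "so-report" or not parts[2]:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    _, profile, target = parts
    body = json.dumps(build_report(target, profile)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]
```

A script entry point would only need to call `main()`, and because neither wrapper contains report logic, both forms return the same result for the same inputs.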