DataONEorg / mnlite

Light weight read-only DataONE member node in Python Flask
Apache License 2.0

Create simple webserver system for testing #34

Open iannesbitt opened 1 year ago

iannesbitt commented 1 year ago

Related to:

After working with this software for a while, I'm becoming aware that there are many valid site configurations out there that we are unable to navigate due to the limitations of the spider and harvesting system.

Given the above planned features for the spider, it would improve code testing significantly to set up a simple web server with a robots.txt and sitemap.xml at the base that delivers content in some of the ways commonly used by data repositories. For example, it would let us test navigation of JavaScript elements that render JSON-LD content after the page is loaded (e.g. MagIC, DataONEorg/member-repos#16), an application/ld+json delivery system (e.g. Harvard Dataverse, DataONEorg/member-repos#52), some valid but alternative configurations of schema.org data (e.g. CanWIN, DataONEorg/member-repos#67), and perhaps some misconfigured robots.txt scenarios (e.g. Borealis, DataONEorg/member-repos#51), all without needing to crawl the repositories themselves.
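As a rough illustration of what such a test server could look like, here is a minimal sketch using only the Python standard library (not Flask, to keep it dependency-free). The paths `/dataset/static` and `/dataset/dynamic`, the `RECORD` fixture, and the helper names `build_pages`, `make_handler`, `start_server` and `fetch` are all hypothetical and chosen for illustration; none of them exist in mnlite.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical fixture record, not taken from any real repository.
RECORD = {"@context": "https://schema.org/", "@type": "Dataset", "name": "Test dataset"}


def build_pages(base_url):
    """Map URL paths to (content type, body) pairs for the fixture site."""
    jsonld = json.dumps(RECORD)
    # JSON-LD embedded directly in the HTML.
    static_page = (
        '<html><head><script type="application/ld+json">' + jsonld
        + "</script></head><body>static</body></html>"
    )
    # JSON-LD that only exists after the browser runs the script.
    dynamic_page = (
        "<html><head><script>"
        "var s = document.createElement('script');"
        "s.type = 'application/ld+json';"
        "s.text = JSON.stringify(" + jsonld + ");"
        "document.head.appendChild(s);"
        "</script></head><body>dynamic</body></html>"
    )
    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        "  <url><loc>" + base_url + "/dataset/static</loc></url>\n"
        "  <url><loc>" + base_url + "/dataset/dynamic</loc></url>\n"
        "</urlset>\n"
    )
    robots = "User-agent: *\nAllow: /\nSitemap: " + base_url + "/sitemap.xml\n"
    return {
        "/robots.txt": ("text/plain", robots),
        "/sitemap.xml": ("application/xml", sitemap),
        "/dataset/static": ("text/html", static_page),
        "/dataset/dynamic": ("text/html", dynamic_page),
    }


def make_handler(pages):
    """Build a request handler class that serves the given page mapping."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            entry = pages.get(self.path)
            if entry is None:
                self.send_error(404)
                return
            content_type, body = entry
            data = body.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, format, *args):
            pass  # keep test output quiet

    return Handler


def start_server():
    """Start the fixture server on a free local port in a background thread."""
    pages = {}
    server = ThreadingHTTPServer(("127.0.0.1", 0), make_handler(pages))
    host, port = server.server_address[:2]
    base_url = f"http://{host}:{port}"
    # The pages need the final port, so fill the mapping after binding.
    pages.update(build_pages(base_url))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, base_url


def fetch(url):
    """GET a URL, bypassing any configured proxy so requests stay local."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.status, response.headers.get("Content-Type"), response.read().decode("utf-8")
```

A test could call `start_server()`, point the spider at the returned base URL, and call `server.shutdown()` afterwards. Misconfigured robots.txt scenarios could be covered by swapping the `/robots.txt` entry for a deliberately broken one.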

iannesbitt commented 1 year ago

Also potentially useful for testing #35

iannesbitt commented 1 year ago

Example server tree:

├── metadata
│   ├── CANWIN.jsonld
│   ├── HAKAI_IYS.jsonld
│   ├── HD-301-response.jsonld
│   └── HD-redirect.jsonld
├── robots.txt
└── sitemap.xml

iannesbitt commented 1 year ago

Content negotiation using Django REST framework: https://www.django-rest-framework.org/api-guide/content-negotiation/
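For reference, the core idea of content negotiation can be sketched without any framework. The snippet below is a simplified standard-library illustration, not Django REST framework's implementation: the function names `parse_accept`, `matches` and `negotiate` are hypothetical, and it ranks media ranges by q-value only, ignoring the specificity precedence rules of the HTTP specification.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable weight as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range, media_type):
    """Return True if a media range such as 'text/*' covers a media type."""
    if media_range == "*/*":
        return True
    range_main, _, range_sub = media_range.partition("/")
    type_main = media_type.partition("/")[0]
    if range_sub == "*":
        return range_main == type_main
    return media_range == media_type


def negotiate(accept_header, available, default=None):
    """Pick the available media type matched by the best-ranked range."""
    if not accept_header:
        # No preference stated: serve the first representation on offer.
        return available[0] if available else default
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return default


# Hypothetical representations a test server might offer for one record.
AVAILABLE = ["text/html", "application/ld+json"]
print(negotiate("text/html;q=0.5, application/ld+json", AVAILABLE))
```

In a test server, the handler would pass the request's `Accept` header to `negotiate` and return either the HTML landing page or the raw JSON-LD file, responding with 406 when the result is `None`.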

iannesbitt commented 10 months ago

Removing label as this is not necessarily related to a version.