Use crawlers' crawlctl for automating metadata extraction from locally staged products

capstone-coal / coal-sds

An Apache OODT-powered Science Data System for COAL

Apache License 2.0


Use crawlers' crawlctl for automating metadata extraction from locally staged products #21

Closed lewismc closed 6 years ago

lewismc commented 6 years ago

As discussed on today's call, once #8 is addressed, we should automate invocation of the crawler using crawlctl. The idea is that products sent to data/staging are detected automatically, which kicks off metadata extraction followed by ingestion into the File Manager.

lewismc commented 6 years ago

This file can now be seen at https://github.com/capstone-coal/coal-sds/blob/master/crawler/src/main/resources/bin/crawlctl. Essentially, we ensure that the Crawler runs as a daemon, checking the local directory every 2 seconds and deleting the original staged products upon successful ingest. Additionally, the products are archived as well as ingested into the File Manager.
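
For anyone who wants to picture the behaviour without reading the script, here is a minimal Python sketch of the same poll, extract, ingest, delete loop. It is not the crawlctl script and it does not call any OODT API: `extract_metadata`, `ingest`, `crawl_once`, `run_daemon`, the in-memory `catalog` list and the `archive_dir` argument are all hypothetical stand-ins, and only the 2-second wait and the delete-on-success behaviour come from the description above.

```python
import shutil
import time
from pathlib import Path


def extract_metadata(product: Path) -> dict:
    """Hypothetical stand-in for a metadata extractor (not an OODT call)."""
    return {"Filename": product.name, "FileSize": product.stat().st_size}


def ingest(product: Path, archive_dir: Path, catalog: list) -> None:
    """Archive a copy of the product and record its metadata.

    The list `catalog` is an assumption standing in for the File Manager.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    metadata = extract_metadata(product)
    shutil.copy2(product, archive_dir / product.name)
    catalog.append(metadata)


def crawl_once(staging_dir: Path, archive_dir: Path, catalog: list) -> int:
    """Ingest every file currently in staging; return how many succeeded."""
    ingested = 0
    for product in sorted(staging_dir.iterdir()):
        if not product.is_file():
            continue
        try:
            ingest(product, archive_dir, catalog)
        except OSError:
            # Leave the product in staging so the next pass can retry it.
            continue
        # Delete the staged original only after a successful ingest.
        product.unlink()
        ingested += 1
    return ingested


def run_daemon(staging_dir: Path, archive_dir: Path, catalog: list,
               wait_seconds: float = 2.0, max_passes=None) -> None:
    """Poll staging repeatedly, sleeping wait_seconds after each pass.

    max_passes=None loops forever, like a daemon; a number bounds the loop.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        crawl_once(staging_dir, archive_dir, catalog)
        passes += 1
        time.sleep(wait_seconds)
```

Calling `run_daemon(Path("data/staging"), Path("data/archive"), [])` would poll indefinitely; `data/archive` is an assumed location, not a path taken from the repository. Deleting only after `ingest` returns means a failed ingest leaves the product in staging for the next pass.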