capstone-coal / coal-sds

An Apache OODT-powered Science Data System for COAL
Apache License 2.0

Use crawlers' crawlctl for automating metadata extraction from locally staged products #21

Closed by lewismc 5 years ago

lewismc commented 5 years ago

As discussed on today's call, once #8 is addressed, we should automate invocation of the crawler using crawlctl. The idea is that products sent to data/staging are detected automatically, which kicks off metadata extraction followed by ingestion into the File Manager.

lewismc commented 5 years ago

This file can now be seen at https://github.com/capstone-coal/coal-sds/blob/master/crawler/src/main/resources/bin/crawlctl. Essentially, we ensure that the Crawler runs as a daemon, checking the local directory every 2 seconds and deleting the original staged products upon successful ingest. Additionally, the products are archived as well as ingested into the File Manager.
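
For anyone who wants to picture the behaviour described above, here is a minimal Python sketch of the poll, extract, ingest, delete cycle. It is not the crawlctl script and does not call any OODT API: `extract_metadata`, `ingest`, `crawl_once`, `run_daemon`, and the `max_passes` parameter are hypothetical names made up for illustration, and "ingestion" is simulated by copying the product into an archive directory.

```python
import shutil
import time
from pathlib import Path


def extract_metadata(product: Path) -> dict:
    """Hypothetical stand-in for a metadata extractor (not an OODT call)."""
    return {"Filename": product.name, "FileSize": product.stat().st_size}


def ingest(product: Path, metadata: dict, archive_dir: Path) -> bool:
    """Hypothetical stand-in for File Manager ingestion.

    Here "ingest" only archives a copy of the product; the metadata is
    accepted but not stored anywhere.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(product, archive_dir / product.name)
    return True


def crawl_once(staging_dir: Path, archive_dir: Path) -> list:
    """Run one pass over the staging directory; return ingested file names."""
    ingested = []
    for product in sorted(staging_dir.iterdir()):
        if not product.is_file():
            continue
        metadata = extract_metadata(product)
        if ingest(product, metadata, archive_dir):
            # Delete the staged original only after a successful ingest.
            product.unlink()
            ingested.append(product.name)
    return ingested


def run_daemon(staging_dir: Path, archive_dir: Path,
               wait_seconds: float = 2.0, max_passes=None) -> None:
    """Poll the staging directory every `wait_seconds` (2 s, as in the issue).

    `max_passes` is an assumption added so the loop can be stopped in tests;
    None means run until interrupted.
    """
    passes = 0
    while max_passes is None or passes < max_passes:
        crawl_once(staging_dir, archive_dir)
        passes += 1
        time.sleep(wait_seconds)
```

Calling `run_daemon(Path("data/staging"), Path("data/archive"))` would then poll data/staging until interrupted; the data/archive location is an assumption, since the issue does not say where archived products end up.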