terraref / computing-pipeline

Pipeline to Extract Plant Phenotypes from Reference Data
BSD 3-Clause "New" or "Revised" License

What data goes to geodashboard, clowder, betydb? put this into extractors #255

Closed ghost closed 7 years ago

ghost commented 7 years ago

@robkooper and @max-zilla add info here

max-zilla commented 7 years ago

This should be defined for each extractor: its inputs/outputs and where its metadata and outputs are written. It all feeds into the overall picture of interoperability among extractors and algorithms.
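A rough sketch of what such a per-extractor definition could look like. The class, field names, and the example extractor are hypothetical and not part of the pipeline; the three destinations are just the ones named in the issue title.

```python
from dataclasses import dataclass, field

# Destination labels taken from the issue title; treating them as a
# fixed set is an assumption made for this sketch.
DESTINATIONS = {"clowder", "betydb", "geodashboard"}


@dataclass
class ExtractorSpec:
    """Hypothetical record of what one extractor reads and where it writes."""
    name: str
    inputs: list = field(default_factory=list)
    # Maps an output description to the system it is written to.
    outputs: dict = field(default_factory=dict)

    def validate(self):
        """Raise ValueError if any output targets an unknown destination."""
        unknown = set(self.outputs.values()) - DESTINATIONS
        if unknown:
            raise ValueError(f"{self.name}: unknown destinations {sorted(unknown)}")
        return self

    def outputs_for(self, destination):
        """Return the sorted output descriptions written to one destination."""
        return sorted(k for k, v in self.outputs.items() if v == destination)


# Made-up example entry, for illustration only.
example = ExtractorSpec(
    name="example-extractor",
    inputs=["raw sensor file"],
    outputs={
        "derived image": "clowder",
        "file metadata": "clowder",
        "plot-level trait": "betydb",
    },
).validate()
```

Keeping one such record per extractor would make it possible to answer "what goes to betydb?" with `example.outputs_for("betydb")` rather than by reading each extractor's code.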

robkooper commented 7 years ago

I think it would be good to have a document somewhere that lists all extractors, with their inputs as well as their outputs. Some of this information might already be in extractor_info.json.
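As a starting point for that list, a minimal sketch that gathers whatever is already in the extractor_info.json files under a checkout. The function name is hypothetical, and the `name`, `description`, and `process` keys are assumptions about what those files contain; missing keys fall back to defaults, and outputs would still need to be documented by hand.

```python
import json
from pathlib import Path


def collect_extractor_info(root):
    """Return one summary row per extractor_info.json found under root."""
    rows = []
    for path in sorted(Path(root).rglob("extractor_info.json")):
        with path.open() as fh:
            info = json.load(fh)
        rows.append({
            # Fall back to the containing directory name if "name" is absent.
            "name": info.get("name", path.parent.name),
            "description": info.get("description", ""),
            # Assumed to describe what triggers the extractor (its inputs).
            "process": info.get("process", {}),
            "source": str(path),
        })
    return rows
```

The returned rows can be dumped to CSV and pasted into a shared sheet, so the hand-maintained document only has to add the columns the JSON does not cover.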

robkooper commented 7 years ago

As stated on Thursday, we should not be afraid of duplicating information in different places if that makes it easier for a user to find.

max-zilla commented 7 years ago

@robkooper that's the document I intend to create as a result of this. Want to build on: https://docs.google.com/spreadsheets/d/1LLiQSFHbEWoo_FkvG1nl-XoSq_lZ_o6VazfxmPfOH9Y/edit#gid=0

max-zilla commented 7 years ago

Just added a 2nd sheet to that Google Doc with input/output columns and some restructuring.

I'm advancing this alongside refactoring the PyClowder2 geostreams code in the extractors to avoid code duplication.