ocean-tracking-network / rt-sat-to-obis

A pilot for delivering real-time satellite telemetry data to OBIS from a few popular manufacturers' data portals, with QC applied to improve the position estimates

refine source definition file format #2

Open jdpye opened 1 week ago

jdpye commented 1 week ago

I'm thinking of a script that can handle multiple data sources, so I'm looking at using YAML files to define what each data source is and where its source fields are. Each source file would then have URLs to the appropriate source data for telemetry (here meaning 'location data from the instrument'), tag attachment to individuals, and project-level metadata.

There's probably a lot I'm missing but I think this is a suitable skeleton to start from! See the strawman in https://github.com/ocean-tracking-network/rt-sat-to-obis/blob/main/sources/imos_ct180.yml
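
As a rough illustration of the idea above (not the contents of the strawman file), a parsed source definition could be validated like this. Every key name here (`name`, `urls`, `telemetry`, `tag_attachment`, `project_metadata`, `fields`) and the `example.org` URLs are assumptions made for the sketch; the dict stands in for whatever a YAML parser would return.

```python
from dataclasses import dataclass

# Hypothetical key names; the actual strawman file may use different ones.
REQUIRED_URL_KEYS = ("telemetry", "tag_attachment", "project_metadata")


@dataclass(frozen=True)
class SourceDefinition:
    name: str
    telemetry_url: str
    tag_attachment_url: str
    project_metadata_url: str
    fields: dict  # common field name -> column name used by this source


def load_source(mapping: dict) -> SourceDefinition:
    """Check that a parsed source definition has the expected keys."""
    urls = mapping.get("urls", {})
    missing = [key for key in REQUIRED_URL_KEYS if key not in urls]
    if "name" not in mapping:
        missing.insert(0, "name")
    if missing:
        raise ValueError(f"source definition is missing: {', '.join(missing)}")
    return SourceDefinition(
        name=mapping["name"],
        telemetry_url=urls["telemetry"],
        tag_attachment_url=urls["tag_attachment"],
        project_metadata_url=urls["project_metadata"],
        fields=dict(mapping.get("fields", {})),
    )


# Stand-in for a parsed YAML source file (placeholder values throughout).
example = {
    "name": "example_source",
    "urls": {
        "telemetry": "https://example.org/telemetry.csv",
        "tag_attachment": "https://example.org/deployments.csv",
        "project_metadata": "https://example.org/project.json",
    },
    "fields": {"latitude": "lat", "longitude": "lon", "timestamp": "date"},
}

source = load_source(example)
```

With PyYAML installed, the mapping could come from `yaml.safe_load` on the source file instead of the inline dict. Keeping validation separate from parsing means the same check works no matter how the file is read.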

MathewBiddle commented 1 week ago

Would it be of interest to use Frictionless Data Packages (FDP) here?

FDP spec: https://specs.frictionlessdata.io/data-package/
GitHub repos: https://github.com/frictionlessdata
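
For comparison, a minimal sketch of what a Data Package descriptor for one source might look like, built in Python and serialized to JSON. The package name, resource names, and `example.org` paths are placeholders, not anything defined in this repo; only the `name`, `resources`, and `path` properties come from the Data Package spec.

```python
import json


def build_descriptor(package_name: str, resources: dict) -> dict:
    """Build a minimal Data Package descriptor.

    `resources` maps a resource name to a URL or relative path.
    """
    return {
        "name": package_name,
        "resources": [
            {"name": name, "path": path} for name, path in resources.items()
        ],
    }


# Placeholder names and URLs, chosen only for illustration.
descriptor = build_descriptor(
    "example-source",
    {
        "telemetry": "https://example.org/telemetry.csv",
        "tag-attachment": "https://example.org/deployments.csv",
        "project-metadata": "https://example.org/project.json",
    },
)

# This string is what would be written to datapackage.json.
descriptor_json = json.dumps(descriptor, indent=2)
```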

There is also the Library of Congress BagIt specification: https://blogs.loc.gov/thesignal/2019/04/bagit-at-the-library-of-congress/
GitHub repos: https://github.com/orgs/LibraryOfCongress/repositories?type=all&q=bagit

jdpye commented 1 week ago

I'm not sure an FDP or another hard standard is useful for a between-products data object. It might become more useful at a fork in the road, where an intermediate data product we are making is meant to feed multiple data products downstream. If you foresee that, pitch me on it and let me know what is helpful about it.

MathewBiddle commented 1 week ago

Makes sense. Just trying not to invent a new standard if others already exist. YAML is widely used and easy to work with, so let's see what we get.