nilmtk / nilm_metadata

A schema for modelling meters, measurements, appliances, buildings etc
http://nilm-metadata.readthedocs.org
Apache License 2.0

Reduce coupling with HDF5 #14

Closed oliparson closed 9 years ago

oliparson commented 9 years ago

It seems like the metadata project is closely coupled to the HDF5 format. Shouldn't it be agnostic of the data format, and just deal with a NILMTK DataStore? Each DataStore has a save_metadata() function which should deal with writing metadata, rather than requiring a convert_yaml_to_* function for each type of DataStore.
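A rough Python sketch of the idea, to make the proposal concrete. Everything here is an assumption for illustration: the `DataStore` base class, the `save_metadata(key, metadata)` signature, the `InMemoryStore` stand-in, the `write_dataset_metadata` helper and the `/` and `/building1` keys are hypothetical and not taken from NILMTK's actual code.

```python
from abc import ABC, abstractmethod


class DataStore(ABC):
    """Hypothetical base class; the method signature is an assumption."""

    @abstractmethod
    def save_metadata(self, key, metadata):
        """Persist `metadata` (a dict) under `key` in the store's own format."""


class InMemoryStore(DataStore):
    """Toy store standing in for any concrete format (HDF5, CSV, ...)."""

    def __init__(self):
        self.saved = {}

    def save_metadata(self, key, metadata):
        # A real store would write to disk; here we just keep a copy.
        self.saved[key] = dict(metadata)


def write_dataset_metadata(store, dataset, buildings):
    """Format-agnostic writer: it only ever calls store.save_metadata()."""
    store.save_metadata("/", dataset)
    for instance, building in buildings.items():
        store.save_metadata("/building{}".format(instance), building)


store = InMemoryStore()
write_dataset_metadata(store, {"name": "demo"}, {1: {"instance": 1}})
print(sorted(store.saved))
```

With this shape, supporting a new format would only mean adding another `DataStore` subclass; the writer itself would not need a per-format conversion function.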

JackKelly commented 9 years ago

I agree that NILM Metadata should be agnostic of the data format.

The main 'coupling' to HDF5 that I'm aware of is that the docs defining NILM Metadata specify where each metadata object is stored in HDF5 and in YAML. We have to define this somewhere so we might as well define it in the NILM Metadata docs.

The Python scripts which come with NILM Metadata aren't really intended to be part of the metadata spec as such. They're just helpful utilities. Feel free to re-use them or define your own.

I assume you're asking about this stuff because you're working on the CSVDataStore class? You're absolutely right that we shouldn't have a convert_yaml_to_CSVDataStore function. My hunch was that we would describe the CSV data using just a single directory of YAML files, exactly as defined in NILM Metadata.

Taking a step back: file formats can be separated into those which can store metadata in the same file as the data (e.g. HDF5) and those which cannot include metadata with the data (e.g. 'vanilla' CSV). In the latter case, I imagined that the metadata would always be stored in YAML files as defined by the NILM Metadata project. So no conversion from YAML to another on-disk format is required, because we natively use YAML with the CSV files. Does that sound sensible?! Or have I missed something??
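As a minimal sketch of the CSV case in Python: the data goes into a CSV file and the metadata sits next to it in a directory of YAML files. The `save_csv_with_metadata` helper, the `metadata` directory name and the `meter1.csv` file name are all hypothetical, and the YAML file names are only illustrative. To stay within the standard library, the sketch writes the files with `json.dump`; that output is JSON, which YAML 1.2 parsers also accept, but a real implementation would presumably use a proper YAML library.

```python
import csv
import json
import os
import tempfile


def save_csv_with_metadata(root, rows, metadata_files):
    """Write readings to a CSV file and metadata to a sibling directory."""
    metadata_dir = os.path.join(root, "metadata")
    os.makedirs(metadata_dir, exist_ok=True)

    # The data file holds measurements only, no metadata.
    with open(os.path.join(root, "meter1.csv"), "w", newline="") as f:
        csv.writer(f).writerows(rows)

    # One metadata file per object, stored alongside the data.
    for filename, metadata in metadata_files.items():
        with open(os.path.join(metadata_dir, filename), "w") as f:
            # Assumption: JSON output used as a stdlib-only stand-in for YAML.
            json.dump(metadata, f, indent=2)


root = tempfile.mkdtemp()
save_csv_with_metadata(
    root,
    rows=[("timestamp", "power"), (0, 100.0)],
    metadata_files={
        "dataset.yaml": {"name": "demo"},
        "building1.yaml": {"instance": 1},
    },
)
print(sorted(os.listdir(os.path.join(root, "metadata"))))
```

Nothing is converted here: the YAML directory is itself the on-disk metadata format for the CSV data.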

JackKelly commented 9 years ago

@oliparson have your recent changes resolved this issue?

oliparson commented 9 years ago

Yep, I think this is sorted. There's still strong coupling between most of the dataset_converters and the HDFDataStore, but I think that's a separate issue.