nilmtk / nilm_metadata

A schema for modelling meters, measurements, appliances, buildings etc
http://nilm-metadata.readthedocs.org
Apache License 2.0

yaml error calling convert_yaml_to_hdf5 #21

Closed gjwo closed 9 years ago

gjwo commented 9 years ago

Can anybody tell me what is wrong with this section of YAML? It starts at line 549 of building1.yaml, and line 559 is the components line. The traceback from the call to convert_yaml_to_hdf5 is given below. I know the appliance definitions don't yet contain a cassette deck, but I don't think it got that far.

- original_name: HiFi
  description: Teac compact H500 separates, output 50 watts per channel into 8Ω (stereo),
    Total harmonic distortion 0.03%, Input sensitivity 2.8mV (MM), 180mV (line),
    Signal to noise ratio 67dB (MM), 95dB (line), Channel separation 65dB (line),
    Speaker load impedance 4Ω to 16Ω, Dimensions 285 x 131 x 319mm, Weight 7kg
  manufacturer: Teac
  brand: Reference 500
  type: audio system
  room: lounge
  meters: [1]
  components:
  - {type: CD player, model: PD-H500i, nominal_consumption: 40}
  - {type: audio amplifier, model: A-H500i, nominal_consumption: 500}
  - {type: radio, subtype: analogue, model: T-H500, nominal_consumption: 40}
  - {type: cassette deck, model: T-H500, nominal_consumption: 40}
  year_of_purchase: 1990 #approx
  dates_active: {start:  1990-02-01}
ParserError                               Traceback (most recent call last)
<ipython-input-1-352366a75b30> in <module>()
    132     df = df.sort_index()
    133     return df
--> 134 convert_gjw('C:/Users/GJWood/nilm_gjw_data',None)

<ipython-input-1-352366a75b30> in convert_gjw(gjw_path, output_filename, format)
    109             break # only 1 folder with .csv files at present
    110     store.close()
--> 111     convert_yaml_to_hdf5(join(gjw_path, 'metadata'),output_filename)
    112     print("Done converting gjw to HDF5!")
    113 

c:\users\gjwood\nilm_metadata\nilm_metadata\convert_yaml_to_hdf5.pyc in convert_yaml_to_hdf5(yaml_dir, hdf_filename)
     48         except:
     49             group = store._handle.get_node('/' + building)
---> 50         building_metadata = _load_file(yaml_dir, fname)
     51         elec_meters = building_metadata['elec_meters']
     52         _deep_copy_meters(elec_meters)

c:\users\gjwood\nilm_metadata\nilm_metadata\convert_yaml_to_hdf5.pyc in _load_file(yaml_dir, yaml_filename)
    100     if isfile(yaml_full_filename):
    101         with open(yaml_full_filename) as fh:
--> 102             return yaml.load(fh)
    103     else:
    104         print(yaml_full_filename, "not found.", file=stderr)

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\__init__.pyc in load(stream, Loader)
     69     loader = Loader(stream)
     70     try:
---> 71         return loader.get_single_data()
     72     finally:
     73         loader.dispose()

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\constructor.pyc in get_single_data(self)
     35     def get_single_data(self):
     36         # Ensure that the stream contains a single document and construct it.
---> 37         node = self.get_single_node()
     38         if node is not None:
     39             return self.construct_document(node)

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in get_single_node(self)
     34         document = None
     35         if not self.check_event(StreamEndEvent):
---> 36             document = self.compose_document()
     37 
     38         # Ensure that the stream contains no more documents.

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_document(self)
     53 
     54         # Compose the root node.
---> 55         node = self.compose_node(None, None)
     56 
     57         # Drop the DOCUMENT-END event.

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_node(self, parent, index)
     82             node = self.compose_sequence_node(anchor)
     83         elif self.check_event(MappingStartEvent):
---> 84             node = self.compose_mapping_node(anchor)
     85         self.ascend_resolver()
     86         return node

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_mapping_node(self, anchor)
    131             #    raise ComposerError("while composing a mapping", start_event.start_mark,
    132             #            "found duplicate key", key_event.start_mark)
--> 133             item_value = self.compose_node(node, item_key)
    134             #node.value[item_key] = item_value
    135             node.value.append((item_key, item_value))

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_node(self, parent, index)
     80             node = self.compose_scalar_node(anchor)
     81         elif self.check_event(SequenceStartEvent):
---> 82             node = self.compose_sequence_node(anchor)
     83         elif self.check_event(MappingStartEvent):
     84             node = self.compose_mapping_node(anchor)

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_sequence_node(self, anchor)
    109         index = 0
    110         while not self.check_event(SequenceEndEvent):
--> 111             node.value.append(self.compose_node(node, index))
    112             index += 1
    113         end_event = self.get_event()

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_node(self, parent, index)
     82             node = self.compose_sequence_node(anchor)
     83         elif self.check_event(MappingStartEvent):
---> 84             node = self.compose_mapping_node(anchor)
     85         self.ascend_resolver()
     86         return node

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\composer.pyc in compose_mapping_node(self, anchor)
    125         if anchor is not None:
    126             self.anchors[anchor] = node
--> 127         while not self.check_event(MappingEndEvent):
    128             #key_event = self.peek_event()
    129             item_key = self.compose_node(node, None)

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\parser.pyc in check_event(self, *choices)
     96         if self.current_event is None:
     97             if self.state:
---> 98                 self.current_event = self.state()
     99         if self.current_event is not None:
    100             if not choices:

c:\Users\GJWood\Anaconda\lib\site-packages\yaml\parser.pyc in parse_block_mapping_key(self)
    437             token = self.peek_token()
    438             raise ParserError("while parsing a block mapping", self.marks[-1],
--> 439                     "expected <block end>, but found %r" % token.id, token.start_mark)
    440         token = self.get_token()
    441         event = MappingEndEvent(token.start_mark, token.end_mark)

ParserError: while parsing a block mapping
  in "C:\Users\GJWood\nilm_gjw_data\metadata\building1.yaml", line 549, column 3
expected <block end>, but found '-'
  in "C:\Users\GJWood\nilm_gjw_data\metadata\building1.yaml", line 559, column 3
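
The ParserError above names two positions: line 549, where the block mapping starts, and line 559, where the unexpected '-' was found. In a long metadata file it can help to print just the lines around those positions, with whitespace made visible, before reading the whole file. The following is a minimal sketch using only the standard library; `show_context` is a hypothetical helper and is not part of nilm_metadata or PyYAML.

```python
def show_context(text, line_no, radius=2):
    """Return the lines around a 1-based line number, each prefixed with its number.

    Hypothetical helper for illustration; not part of nilm_metadata or PyYAML.
    """
    lines = text.splitlines()
    first = max(1, line_no - radius)
    last = min(len(lines), line_no + radius)
    rows = []
    for n in range(first, last + 1):
        marker = "-->" if n == line_no else "   "
        # repr() makes tabs and trailing spaces visible, which are common
        # causes of indentation problems in YAML.
        rows.append("%s %4d %r" % (marker, n, lines[n - 1]))
    return "\n".join(rows)


# Small made-up sample: line 4 is indented with a tab instead of spaces.
sample = "a: 1\nb:\n  - x\n\t- y\nc: 3\n"
print(show_context(sample, 4, radius=1))
```

For the file in this issue, the same helper could be called as `show_context(open(path).read(), 559, radius=10)`, where `path` points at building1.yaml, to view the whole appliance entry from its start to the components line.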
gjwo commented 9 years ago

Any clues on how to find instance errors in a 700+ line metadata file? :confounded:

KeyError                                  Traceback (most recent call last)
<ipython-input-1-352366a75b30> in <module>()
    132     df = df.sort_index()
    133     return df
--> 134 convert_gjw('C:/Users/GJWood/nilm_gjw_data',None)

<ipython-input-1-352366a75b30> in convert_gjw(gjw_path, output_filename, format)
    109             break # only 1 folder with .csv files at present
    110     store.close()
--> 111     convert_yaml_to_hdf5(join(gjw_path, 'metadata'),output_filename)
    112     print("Done converting gjw to HDF5!")
    113 

c:\users\gjwood\nilm_metadata\nilm_metadata\convert_yaml_to_hdf5.pyc in convert_yaml_to_hdf5(yaml_dir, hdf_filename)
     53         _set_data_location(elec_meters, building)
     54         _sanity_check_meters(elec_meters, meter_devices)
---> 55         _sanity_check_appliances(building_metadata)
     56         group._f_setattr('metadata', building_metadata)
     57 

c:\users\gjwood\nilm_metadata\nilm_metadata\convert_yaml_to_hdf5.pyc in _sanity_check_appliances(building_metadata)
    170         appl_type = appliance['type']
    171         instances = appliance_instances.setdefault(appl_type, [])
--> 172         instances.append(appliance['instance'])
    173 
    174     for appliance_type, instances in appliance_instances.iteritems():

KeyError: 'instance'
nipunbatra commented 9 years ago

Can you validate your YAML using YAMLlint?

gjwo commented 9 years ago

Thanks Nipun, I will try that, but I am past the syntax errors now. The error above looks like it is in the sanity check code. I looked at the code, and the next section would have output some clues, but this looks like something wasn't initialised. Is there something I should have called before the converter code?

gjwo commented 9 years ago

YAMLint - Valid YAML!

gjwo commented 9 years ago

Here is the current code and errors in an IPython Notebook: https://github.com/gjwo/nilm_gjw_data/blob/master/gjw_converter_test.ipynb I could do with some help on this; it does look like the toolkit metadata code is failing.

gjwo commented 9 years ago

OK, I have found the problem by putting print(appliance) at line 172 of convert_yaml_to_hdf5.py. I was missing an "instance: 1" in several places, which caused a KeyError exception that had no handler at line 173.

I seem to recall reading that if there was only one instance of a type then the instance key was optional. Clearly this is not the case, doh!

@JackKelly @nipunreddevil I think making clear which keys are mandatory in the documentation, and adding an exception handler which prints the device where the checking failed would help those who follow
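
To illustrate the kind of check suggested here, the following is a minimal sketch and not the actual nilm_metadata code. It assumes that `type` and `instance` are the keys to verify, and that `original_name` is available to identify the entry; the function name `find_incomplete_appliances` is hypothetical. Instead of failing with a bare KeyError on the first problem, it collects one message per offending appliance.

```python
def find_incomplete_appliances(appliances, required=("type", "instance")):
    """Return one message per appliance that lacks a required key.

    Hypothetical sketch: the function name and the choice of required
    keys are assumptions, not the nilm_metadata implementation.
    """
    problems = []
    for position, appliance in enumerate(appliances, start=1):
        missing = [key for key in required if key not in appliance]
        if missing:
            # Use original_name if present so the entry is easy to find
            # in a long YAML file.
            name = appliance.get("original_name", "unnamed")
            problems.append(
                "appliance #%d (%s) is missing: %s"
                % (position, name, ", ".join(missing))
            )
    return problems


# Example data modelled on the entries in this issue.
appliances = [
    {"original_name": "Fridge", "type": "fridge", "instance": 1},
    {"original_name": "HiFi", "type": "audio system"},  # no 'instance'
]
for message in find_incomplete_appliances(appliances):
    print(message)
```

Reporting every incomplete entry in one pass avoids the fix, rerun, fail again cycle when several appliances are missing the same key.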

JackKelly commented 9 years ago

I seem to recall reading that if there was only one instance of a type then the instance key was optional

Do you remember where you read that? As you say, that is wrong and so we should change the docs if necessary. Perhaps our various discussions about simplifying the schema are causing confusion here? Or perhaps NILMTK's behaviour is causing confusion here: in NILMTK you can do elec['fridge'] and you'll get fridge 1.

I think making clear which keys are mandatory in the documentation

We do try to do that. The required keys have '(required)' written next to them in the documentation (for example, here's the Appliance docs). Perhaps '(required)' should be in bold or a different colour or something?

adding an exception handler which prints the device where the checking failed would help those who follow

Absolutely right. I'll do that now and report back when I'm done. Sorry the code is not very friendly in (lots of) places ;)

JackKelly commented 9 years ago

OK, I have added more sanity checks, along with some simple unit tests for them. I'll close this issue for now. Please re-open if necessary.