CivicSpleen / ambry

A comprehensive data package manager
BSD 2-Clause "Simplified" License

medicare.gov-dfcd.yaml import error: KeyError: u'unknown' #124

Status: Closed (nmb10 closed this issue 8 years ago)

nmb10 commented 8 years ago

Command run:

python load_pre10.py ../../pre-10-bundles/converted/medicare.gov-dfcd/medicare.gov-dfcd.yaml 

Output, ending in the error:

Starting import ../../pre-10-bundles/converted/medicare.gov-dfcd/medicare.gov-dfcd.yaml...
Loading bundle: medicare.gov-dfcd-0.0.1~d04c001
INFO medicare.gov-dfcd ---- Synchronized ----
Starting ingest...
INFO medicare.gov-dfcd Ingesting: fac_data from http://data.medicare.gov/views/bg9k-emty/files/qUrdPCwgdv-H4tocW1m_6PtReuPSzyJHioinJ0_UkdI?filename=DFCompare_Revised_FlatFiles.zip
INFO medicare.gov-dfcd Ingesting fac_data: intuit_type 5443 of 6140, rate: 626.11
INFO medicare.gov-dfcd Ingesting fac_data: run_stats 3238 of 6140, rate: 394.1
INFO medicare.gov-dfcd Ingested: fac_data.mpr
INFO medicare.gov-dfcd Ingesting: national_data from http://data.medicare.gov/views/bg9k-emty/files/qUrdPCwgdv-H4tocW1m_6PtReuPSzyJHioinJ0_UkdI?filename=DFCompare_Revised_FlatFiles.zip
INFO medicare.gov-dfcd Ingesting national_data: intuit_type 5432 of 6140, rate: 622.47
INFO medicare.gov-dfcd Ingesting national_data: run_stats 3109 of 6140, rate: 379.64
INFO medicare.gov-dfcd Ingested: national_data.mpr
INFO medicare.gov-dfcd Ingesting: qip_data from http://data.medicare.gov/views/bg9k-emty/files/qUrdPCwgdv-H4tocW1m_6PtReuPSzyJHioinJ0_UkdI?filename=DFCompare_Revised_FlatFiles.zip
INFO medicare.gov-dfcd Ingesting qip_data: intuit_type 5023 of 6140, rate: 581.88
INFO medicare.gov-dfcd Ingesting qip_data: run_stats 2951 of 6140, rate: 395.01
INFO medicare.gov-dfcd Ingested: qip_data.mpr
INFO medicare.gov-dfcd Ingesting: state_data from http://data.medicare.gov/views/bg9k-emty/files/qUrdPCwgdv-H4tocW1m_6PtReuPSzyJHioinJ0_UkdI?filename=DFCompare_Revised_FlatFiles.zip
INFO medicare.gov-dfcd Ingesting state_data: intuit_type 5253 of 6140, rate: 605.56
INFO medicare.gov-dfcd Ingesting state_data: run_stats 3072 of 6140, rate: 388.56
INFO medicare.gov-dfcd Ingested: state_data.mpr
Starting schema...
INFO medicare.gov-dfcd Populating table: qip_data
INFO medicare.gov-dfcd Populating table: state_data
INFO medicare.gov-dfcd Populating table: national_data
INFO medicare.gov-dfcd Populating table: fac_data
Starting build...
INFO medicare.gov-dfcd ---- Phase: build ---
INFO medicare.gov-dfcd Processing 4 sources, stage main ; [u'fac_data', u'national_data', u'qip_data', u'state_data']
INFO medicare.gov-dfcd Running phase build for source fac_data with pipeline build
Traceback (most recent call last):
  File "load_pre10.py", line 262, in <module>
    main()
  File "load_pre10.py", line 221, in main
    _build(b)
  File "load_pre10.py", line 198, in _build
    b.build(force=force)
  File "/home/nmb10/projects/ambry_project/ambry/bundle/bundle.py", line 1378, in build
    return self.run_phase('build', sources=sources, stage=stage, force=force)
  File "/home/nmb10/projects/ambry_project/ambry/bundle/bundle.py", line 1292, in run_phase
    self.phase_main(phase, stage=stage, sources=sources)
  File "/home/nmb10/projects/ambry_project/ambry/bundle/bundle.py", line 1242, in phase_main
    pl.run(count=rows_count)
  File "/home/nmb10/projects/ambry_project/ambry/etl/pipeline.py", line 1782, in run
    self.sink.run()
  File "/home/nmb10/projects/ambry_project/ambry/etl/pipeline.py", line 285, in run
    for i, row in enumerate(self._source_pipe):
  File "/home/nmb10/projects/ambry_project/ambry/etl/pipeline.py", line 139, in __iter__
    self.headers = self.process_header(next(rg))
  File "/home/nmb10/projects/ambry_project/ambry/etl/pipeline.py", line 139, in __iter__
    self.headers = self.process_header(next(rg))
  File "/home/nmb10/projects/ambry_project/ambry/etl/pipeline.py", line 139, in __iter__
    self.headers = self.process_header(next(rg))
  File "/home/nmb10/projects/ambry_project/ambry/etl/caster.py", line 237, in process_header
    type_f = c.valuetype_class
  File "/home/nmb10/projects/ambry_project/ambry/orm/column.py", line 174, in valuetype_class
    return self.types[self.datatype][1]
KeyError: u'unknown'
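
Reading the last frame: `valuetype_class` indexes `self.types` by `self.datatype`, and the column's datatype was `u'unknown'`, which is not a key in that table. Below is a minimal sketch of that failure mode and of a lookup that reports the offending datatype. The `TYPES` table and both function names are hypothetical stand-ins, not ambry's actual code; the real mapping lives in `ambry/orm/column.py` and its contents are not shown in this report.

```python
# Hypothetical stand-in for the type table that Column.types holds in ambry.
# The real keys and tuple layout are assumptions made for illustration only.
TYPES = {
    "str": ("text", str),
    "int": ("integer", int),
    "float": ("real", float),
}


def valuetype_class(datatype):
    """Mirror the failing line: index the table directly by datatype."""
    return TYPES[datatype][1]


def valuetype_class_checked(datatype):
    """Same lookup, but raise an error that names the unexpected datatype."""
    try:
        return TYPES[datatype][1]
    except KeyError:
        raise ValueError(
            "Column datatype {!r} is not in the type table; known types: {}".format(
                datatype, sorted(TYPES)
            )
        )


if __name__ == "__main__":
    print(valuetype_class("int"))
    try:
        valuetype_class("unknown")  # raises KeyError, as in the traceback
    except KeyError as exc:
        print("KeyError:", exc)
    try:
        valuetype_class_checked("unknown")
    except ValueError as exc:
        print("ValueError:", exc)
```

This only shows why the lookup fails. Whether the right fix is to resolve the column type before the build phase or to handle `unknown` in the caster is a question for the maintainers.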