CivicSpleen / ambry

A comprehensive data package manager
BSD 2-Clause "Simplified" License

cde.ca.gov-schools.yaml import error: SourceError: Fixed width source must have a schema defined, with column widths. #113

Closed: nmb10 closed this issue 8 years ago

nmb10 commented 8 years ago

Command run:

python load_pre10.py ../../pre-10-bundles/converted/cde.ca.gov-schools/cde.ca.gov-schools.yaml

Resulting error:

Starting import ../../pre-10-bundles/converted/cde.ca.gov-schools/cde.ca.gov-schools.yaml...
Loading bundle: cde.ca.gov-schools-0.0.3~d03L003
INFO cde.ca.gov-schools ---- Synchronized ----
Starting ingest...
INFO cde.ca.gov-schools Ingesting: schools from ftp://ftp.cde.ca.gov/demo/schlname/pubschls.txt
INFO cde.ca.gov-schools Downloading ftp://ftp.cde.ca.gov/demo/schlname/pubschls.txt
INFO cde.ca.gov-schools Downloading ftp://ftp.cde.ca.gov/demo/schlname/pubschls.txt
INFO cde.ca.gov-schools Downloading ftp://ftp.cde.ca.gov/demo/schlname/pubschls.txt
Traceback (most recent call last):
  File "load_pre10.py", line 251, in <module>
    main()
  File "load_pre10.py", line 220, in main
    _ingest(b)
  File "load_pre10.py", line 177, in _ingest
    b.ingest(force=force, clean_files=clean_files)
  File "/home/nmb10/projects/ambry_project/ambry/bundle/bundle.py", line 902, in ingest
    source.datafile.load_rows(s, s.spec)
  File "/home/nmb10/.virtualenvs/ambry/local/lib/python2.7/site-packages/ambry_sources/mpf.py", line 429, in load_rows
    intuit_type=intuit_type, run_stats=run_stats)
  File "/home/nmb10/.virtualenvs/ambry/local/lib/python2.7/site-packages/ambry_sources/mpf.py", line 459, in _load_rows
    w.load_rows(source)
  File "/home/nmb10/.virtualenvs/ambry/local/lib/python2.7/site-packages/ambry_sources/mpf.py", line 718, in load_rows
    for row in iter(source):
  File "/home/nmb10/.virtualenvs/ambry/local/lib/python2.7/site-packages/ambry_sources/sources/accessors.py", line 177, in __iter__
    parser = self.make_fw_row_parser()
  File "/home/nmb10/.virtualenvs/ambry/local/lib/python2.7/site-packages/ambry_sources/sources/accessors.py", line 151, in make_fw_row_parser
    raise SourceError('Fixed width source must have a schema defined, with column widths.')
ambry_sources.sources.exceptions.SourceError: Fixed width source must have a schema defined, with column widths.

ericbusboom commented 8 years ago

In this case, the correct behavior is to set an error state ('ingest_error') and exit with an exception. It would be possible to handle this case automatically by copying the schema, including the position and width information for the fixed-width file, into a source_schema, but the case is rare, so it is better to do this manually.
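
For illustration, here is a minimal sketch of why a fixed-width source needs positions and widths, and of the "set an error state, then raise" behavior described above. It is not ambry's actual code: `Column`, `ingest`, the `status` dict, the 'ingest_done' value, and the sample data are all hypothetical, and `SourceError` is a local stand-in for `ambry_sources.sources.exceptions.SourceError`. It also assumes 1-based start positions.

```python
from collections import namedtuple

# Hypothetical stand-in for a source-schema column (not ambry's real class).
# `start` is assumed to be 1-based; `width` is the number of characters.
Column = namedtuple("Column", ["name", "start", "width"])


class SourceError(Exception):
    """Local stand-in for ambry_sources.sources.exceptions.SourceError."""


def make_fw_row_parser(columns):
    """Return a function that splits one fixed-width line into stripped fields."""
    if not columns or any(c.start is None or c.width is None for c in columns):
        raise SourceError(
            "Fixed width source must have a schema defined, with column widths."
        )

    # Convert 1-based (start, width) pairs to 0-based slices once, up front.
    slices = [slice(c.start - 1, c.start - 1 + c.width) for c in columns]

    def parse(line):
        return [line[s].strip() for s in slices]

    return parse


def ingest(lines, columns, status):
    """Parse all lines; if the schema is unusable, record the error and re-raise."""
    try:
        parse = make_fw_row_parser(columns)
    except SourceError:
        status["state"] = "ingest_error"  # state name taken from the comment above
        raise
    rows = [parse(line) for line in lines]
    status["state"] = "ingest_done"  # hypothetical success state
    return rows


# Made-up layout and data: a 4-character code followed by a 10-character name.
columns = [Column("code", 1, 4), Column("name", 5, 10)]
status = {}
rows = ingest(["0001Lincoln   ", "0002Jefferson "], columns, status)
```

Doing the fix manually then amounts to filling in `start` and `width` for each column of the source schema by hand before re-running the ingest.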