WPRDC / wprdc-etl

MIT License

Add connector base #35

Closed: bsmithgall closed this 8 years ago

saylorsd commented 8 years ago

@bsmithgall validate_input() is breaking with the police blotter pipeline. It throws an UnsupportedOperation exception stating that the 'underlying stream is not seekable'. Do you think this is because the file is streamed in via HTTP?
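
For illustration only, here is a minimal sketch of how this kind of failure can arise and one possible way around it. It is not the fix that was merged for this PR, which the thread does not show. `NonSeekableStream` is a hypothetical stand-in for an HTTP response body, and this `checksum_contents()` is an assumed free-function variant of the connector method named in the report; the choice of SHA-256 and of buffering the body in memory are also assumptions.

```python
import hashlib
import io


class NonSeekableStream(io.RawIOBase):
    """Hypothetical stand-in for an HTTP body: readable but not seekable."""

    def __init__(self, payload):
        super().__init__()
        self._inner = io.BytesIO(payload)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, buffer):
        # Delegate to the in-memory payload; seek() is deliberately not
        # overridden, so the base class raises io.UnsupportedOperation.
        return self._inner.readinto(buffer)


def checksum_contents(stream, chunk_size=8192):
    """Return (hex digest, seekable stream rewound to position 0)."""
    if not stream.seekable():
        # Assumption: the input is small enough to hold in memory.
        stream = io.BytesIO(stream.read())
    stream.seek(0)
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    stream.seek(0)  # rewind so later pipeline steps can read from the start
    return digest.hexdigest(), stream


if __name__ == "__main__":
    raw = NonSeekableStream(b"a,b\n1,2\n")
    checksum, rewound = checksum_contents(raw)
    print(checksum)
    print(rewound.read())
```

The caller has to keep using the returned stream rather than the original one, since the original has already been consumed. For very large inputs, spooling to a temporary file instead of `io.BytesIO` would avoid holding the whole body in memory.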

bsmithgall commented 8 years ago

Can you post the full traceback?

saylorsd commented 8 years ago

Traceback (most recent call last):
  File "C:\documents\wprdc\wprdc-etl\pipeline\pipeline.py", line 226, in run
    input_checksum = self.validate_input(_connector)
  File "C:\documents\wprdc\wprdc-etl\pipeline\pipeline.py", line 186, in validate_input
    input_checksum = connection.checksum_contents()
  File "C:\documents\wprdc\wprdc-etl\pipeline\connectors.py", line 37, in checksum_contents
    self._file.seek(0)
io.UnsupportedOperation: underlying stream is not seekable

bsmithgall commented 8 years ago

Working on a full fix for this.

saylorsd commented 8 years ago

@bsmithgall Sounds good. I made a few small fixes unrelated to the file stream issue.

One was some typos I had in police.blotter.py, the other removes and I also set up the fatal OD pipeline with what is necessary to interface with the WPRDC. Also, serialize_to_ckan_fields() wasn't returning anything. I just committed them here: c80520aaa9b1a164039e5deb8ca13d753b604c91. If you want to integrate those changes into your current branch, go ahead.

bsmithgall commented 8 years ago

Merged and pushed!