palewire / django-calaccess-raw-data

A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
http://django-calaccess.californiacivicdata.org/
MIT License

ReceivedFilingsCd.filing_file_name char length too short #1490

Closed by gordonje 8 years ago

gordonje commented 8 years ago

This morning's update (8/26/2016) stalled while loading the ReceivedFilingsCd model: the max length of the filing_file_name field is set to 14, and the original TSV file had a single row (line number 405,103) with a 15-character value: '1921655_100.cal'.

Here is the full row, as parsed by csv.DictReader:

{
    'FILING_DIRECTORY': '/cafiler/disclose/ca99/cal',
    'FILING_ID': '1921655',
    'FORM_ID': 'F601',
    'RECEIVE_COMMENT': 'Filing accepted',
    'RECEIVED_DATE': '8/25/2016 10:09:03 AM',
    'FILING_FILE_NAME': '1921655_100.cal',
    'FILER_ID': '1334301'
}
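For anyone wanting to spot rows like this before a load stalls, here is a minimal sketch of a pre-load scan. It is not part of the app: `find_overlong_values` and the three-column sample are hypothetical, and only the column name, the limit of 14 and the offending value come from the row above.

```python
import csv
import io


def find_overlong_values(tsv_file, max_lengths):
    """Yield (line_number, column, value) for each value longer than its limit.

    Hypothetical helper for illustration, not a function from the app.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        for column, limit in max_lengths.items():
            value = row.get(column) or ""
            if len(value) > limit:
                # reader.line_num counts lines in the file, header included
                yield reader.line_num, column, value


# Tiny stand-in for the real TSV; only the first data row mirrors the issue.
sample = io.StringIO(
    "FILER_ID\tFILING_ID\tFILING_FILE_NAME\n"
    "1334301\t1921655\t1921655_100.cal\n"
    "1334301\t1921654\t1921654.cal\n"
)

overlong = list(find_overlong_values(sample, {"FILING_FILE_NAME": 14}))
print(overlong)
```

Run against the real file, a scan along these lines should report the same line number the loader tripped on, assuming the file has no quoted multi-line values.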

Checking the official docs (https://www.documentcloud.org/documents/2711614-CalAccessTablesWeb.html#document/p121), it looks like there are several char fields on this table whose max lengths are way below the prescribed maximums. Not sure why we would have set them any lower than what the official docs say...

Gonna bump all this up, release a patch, and try resuming the update.
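To work out which fields need bumping, one option is to compare the lengths declared on the model with the lengths in the official docs. The sketch below assumes both have been copied into plain dicts by hand. The helper name is hypothetical, and every number except the declared 14 for filing_file_name is a placeholder, not a value taken from the CAL-ACCESS documentation.

```python
def fields_below_documented_length(declared, documented):
    """Return {field: (declared_len, documented_len)} for fields set too short.

    Hypothetical helper for illustration, not a function from the app.
    """
    return {
        name: (declared[name], documented[name])
        for name in declared
        if name in documented and declared[name] < documented[name]
    }


# Only filing_file_name=14 comes from this issue.
declared = {"filing_file_name": 14, "some_other_field": 20}

# Placeholder numbers: look up the real maximums in the official docs.
documented = {"filing_file_name": 60, "some_other_field": 20}

too_short = fields_below_documented_length(declared, documented)
print(too_short)
```

Any field that shows up in the result is a candidate for a larger max length in the patch.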

palewire commented 8 years ago

Good call. Our oversight when doing the models, I suspect.
