After combining, I zipped up Cam_ContribExpenditure.csv and dropped it in the /data folder. However, the import failed.
After some investigating, the DateAdded field was missing from Cam_ContribExpenditure_2.csv, so I opened up LibreOffice and added the column. The column could not be blank, so I copied the date data from DateContribution into this column.
I re-ran the import, but the import failed again. This is the error I got:
sqlalchemy.exc.DataError: (psycopg2.errors.StringDataRightTruncation) value too long for type character varying(255)
[SQL:
UPDATE camp_fin_transaction SET
id=s."contribexpenditureid"::bigint, contact_id=s."contactid"::bigint, amount=s."amount"::money::numeric::double precision, received_date=s."datecontribution"::timestamp with time zone, date_added=s."dateadded"::timestamp with time zone, check_number=s."checknumber"::varchar, memo=s."memo"::text, description=s."description"::varchar, transaction_type_id=s."contribexpendituretypeid"::bigint, filing_id=s."reportid"::bigint, olddb_id=s."olddbcontribexpenditureid"::bigint, name_prefix=s."prefix"::varchar, first_name=s."firstname"::varchar, middle_name=s."middlename"::varchar, last_name=s."lastname"::varchar, suffix=s."suffix"::varchar, company_name=s."companyname"::varchar, address=s."address"::varchar, city=s."city"::varchar, state=s."state"::varchar, zipcode=s."zip"::varchar, county_id=s."countyid"::bigint, country=s."country"::varchar, contact_type_id=s."contacttypeid"::bigint, transaction_status_id=s."transactionstatusid"::bigint, from_file_id=s."fromfileid"::bigint, contact_type_other=s."contacttypeother"::varchar, occupation=s."occupation"::varchar, expenditure_for_certified_candidate=s."isexpenditureforcertifiedcandidate"::boolean
FROM (
SELECT raw."contribexpenditureid", raw."contactid", raw."amount", raw."datecontribution", raw."dateadded", raw."checknumber", raw."memo", raw."description", raw."contribexpendituretypeid", raw."reportid", raw."olddbcontribexpenditureid", raw."prefix", raw."firstname", raw."middlename", raw."lastname", raw."suffix", raw."companyname", raw."address", raw."city", raw."state", raw."zip", raw."countyid", raw."country", raw."contacttypeid", raw."transactionstatusid", raw."fromfileid", raw."contacttypeother", raw."occupation", raw."isexpenditureforcertifiedcandidate"
FROM raw_transaction AS raw
JOIN change_transaction AS change
ON raw."contribexpenditureid" = change.id
) AS s
WHERE camp_fin_transaction.id = s."contribexpenditureid"
]
It looks like one of the fields was more than 255 characters. This was likely do to a data formatting issue.
To diagnose, I fired up psql since the data was already in the raw_transaction table. I looked for any rows that were supposed to load into a field with a 255 character limit but were longer than that:
SELECT raw.*
FROM raw_transaction AS raw
JOIN change_transaction AS change
ON raw."contribexpenditureid" = change.id
WHERE
length(raw."firstname") > 255 OR
length(raw."middlename") > 255 OR
length(raw."lastname") > 255 OR
length(raw."companyname") > 255 OR
length(raw."address") > 255 OR
length(raw."occupation") > 255
Sure enough, there was one row with a glitch in the address field:
I loaded up the Cam_ContribExpenditure.csv file in Sublime Text (combined it was too large for LibreOffice) and searched for the corrupted line and fixed it by hand.
Re zipped Cam_ContribExpenditure.csv, dropped it in the /data folder, and re-ran the import:
Another round of data from the New Mexico Secretary of State. This is likely the last import before they transition to a new system.
data notes
The data provided is the
Cam_ContribExpenditure.csv
data. The file sent was a zip file that contained 2 files:Here are the steps I took to prep and import the data. I used
csvkit
,sql
, LibreOffice and Sublime Textin2csv
in2csv Cam_ContribExpenditure_1.xlsx > Cam_ContribExpenditure_1.csv
in2csv Cam_ContribExpenditure_2.xlsx > Cam_ContribExpenditure_2.csv
csvstack
csvstack Cam_ContribExpenditure_1.csv Cam_ContribExpenditure_2.csv > Cam_ContribExpenditure.csv
After combining, I zipped up
Cam_ContribExpenditure.csv
and dropped it in the/data
folder. However, the import failed.DateAdded
field was missing fromCam_ContribExpenditure_2.csv
, so I opened upLibreOffice
and added the column. The column could not be blank, so I copied the date data fromDateContribution
into this column.I re-ran the import, but the import failed again. This is the error I got:
It looks like one of the fields was more than 255 characters. This was likely do to a data formatting issue.
psql
since the data was already in theraw_transaction
table. I looked for any rows that were supposed to load into a field with a 255 character limit but were longer than that:Sure enough, there was one row with a glitch in the address field:
I loaded up the
Cam_ContribExpenditure.csv
file in Sublime Text (combined it was too large for LibreOffice) and searched for the corrupted line and fixed it by hand.Cam_ContribExpenditure.csv
, dropped it in the/data
folder, and re-ran the import:python manage.py import_data
This time, success!