opencivicdata / python-opencivicdata

python utilities for Open Civic Data
BSD 3-Clause "New" or "Revised" License

Are bill_actions unique within a bill, and should they be? #114

Closed (estaub closed this issue 6 years ago)

estaub commented 6 years ago

I'm working an OpenStates issue that leads me here.

While importing a bill via pupa, I get the exception pupa.exceptions.DataImportError: duplicate key value violates unique constraint "opencivicdata_voteevent_bill_action_id_key". I infer that this happens because the same bill action is attached to two different votes on the same bill, a restriction probably enforced by the DDL generated from this line. If the exception is really something else entirely, read no further, and sorry to waste your time.

This is happening on a bill with a loop in its process; it was re-referred to committee after going to the floor. As a result, some bill actions were duplicated.

I can see two ways to fix this, but I don't have the use-case experience to recommend one over the other.

One fix is to leave the model as is and require data generators like OpenStates to synthesize unique bill actions. In the example linked above, for instance, the two occurrences of 2nd Reading Passed might become [2nd Reading Passed, 2nd Reading Passed (2)].
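A minimal sketch of what that synthesis could look like on the scraper side. The function name make_unique is hypothetical and is not part of pupa or OpenStates; it only assumes that actions arrive as a list of description strings in chronological order.

```python
from collections import Counter


def make_unique(descriptions):
    """Return the descriptions with a " (n)" suffix on repeats.

    Hypothetical helper for illustration only; not a pupa or
    OpenStates function.
    """
    seen = Counter()
    result = []
    for description in descriptions:
        seen[description] += 1
        count = seen[description]
        # The first occurrence is kept verbatim; later ones get a counter.
        if count == 1:
            result.append(description)
        else:
            result.append(f"{description} ({count})")
    return result


actions = ["2nd Reading Passed", "Referred to Committee", "2nd Reading Passed"]
print(make_unique(actions))
```

One caveat with this approach: a synthesized name could still collide with a source action that literally ends in " (2)", so it reduces duplicates rather than guaranteeing uniqueness.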

The other fix would be to relax the uniqueness requirement here.
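To illustrate the difference, here is a toy reproduction using SQLite from the standard library. It is not the real schema: the table vote_event, the column bill_action_id, and the helper insert_votes are simplified stand-ins I made up, and the actual database is generated by the project's models. It only shows how a unique column rejects a second vote that points at the same action, and how dropping the constraint lets both through.

```python
import sqlite3


def insert_votes(unique):
    """Insert two votes that reference the same bill action.

    Returns how many rows were stored. Toy schema for illustration
    only; table and column names are assumptions.
    """
    conn = sqlite3.connect(":memory:")
    constraint = "UNIQUE" if unique else ""
    conn.execute(
        "CREATE TABLE vote_event ("
        "id INTEGER PRIMARY KEY, "
        f"bill_action_id TEXT {constraint})"
    )
    stored = 0
    for _ in range(2):
        try:
            conn.execute(
                "INSERT INTO vote_event (bill_action_id) VALUES (?)",
                ("second-reading-passed",),
            )
            stored += 1
        except sqlite3.IntegrityError:
            # The second insert violates the unique constraint.
            pass
    conn.close()
    return stored


print(insert_votes(unique=True))   # only the first vote is stored
print(insert_votes(unique=False))  # both votes are stored
```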

jamesturk commented 6 years ago

this is being addressed in a PR for opencivicdata/pupa#307