Closed ajhyndman closed 2 years ago
checked a few things, including that taking the first part of the uuidv4 string is the random part, that there's no other obvious random id generator in postgres, and that a base64-encoded uuid still looks pretty bad (so the first part of the uuid string seems better in comparison). the urls are going to be a bit long and scary, and i'm not a huge fan of the URL and the UI disagreeing (which might make base64-encoding a better option, we can have the same thing in the URL and in the UI and just decode the id on load) but not really blocking.
Awesome, thanks for looking over this.
Yeah, as far as I can tell, postgres has the best support for UUID keys. I'm definitely open to trying out base64 encoding in the UI thuogh.
First, this change is totally optional! I don't think it's that bad if we never expunge auto-incrementing IDs from the ARPA reporter.
However, if we are interested in moving toward UUID (or other string-like) keys, I think uploads is probably the most valuable table to migrate now. This is because once we have real reporting cycles underway, upload IDs will exist in quite a few places.
Upload IDs as foreign keys can be migrated, but upload IDs will also end up in upload files persisted on disk as well as in upload links in audit reports. Resolving all discrepancies after-the-fact may never be completely possible.
If we do tackle this change now, we should have one less set of ID conflicts to deal with whenever we merge RI, CT and OH instances. 👍
@igor47 if you do manage to take a look at this, I'd especially like your thoughts on what to do with the UI. Full UUIDs are kind of clunky to render to text.
NOTE: I think it's also interesting to consider something other than a full UUID for our keys. I don't think we're ever expecting millions of uploads in the system. We could use a git-style md5 hash, for instance?