Open johnb30 opened 10 years ago
Do you want them to be sequential/meaningful or can we just do an MD5 hash?
I thought about doing MD5 hashes of something like URL + date, but it might be more useful to have something sequential and meaningful. The easy answer: why not both?
Or just hash the text. It's fast. I'm fine with both (meaningful ID vs. definitely unique ID).
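For illustration, here is a minimal sketch of the "both" option: a readable sequential ID alongside an MD5 hash of the story text. The function names, the `date_seq` layout, and the four-digit zero padding are assumptions made for this example, not anything that exists in the pipeline today.

```python
import hashlib


def text_hash_id(text):
    """Return the MD5 hex digest of the text, encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def make_event_ids(date_str, seq, text):
    """Build a (sequential_id, hash_id) pair for one record.

    date_str : date string, e.g. "20140801" (hypothetical format)
    seq      : running counter for that date (hypothetical)
    text     : the text to hash
    """
    sequential_id = "{}_{:04d}".format(date_str, seq)
    return sequential_id, text_hash_id(text)


if __name__ == "__main__":
    print(make_event_ids("20140801", 7, "abc"))
```

One caveat: hashing only the text means two records with identical text get the same hash, so whether that counts as "definitely unique" depends on whether duplicate text should map to one ID.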
On a semi-related note, can we switch the date to YYYYMMDD rather than YYMMDD? It's closer to ISO 8601, easier to read, and much easier to convert to ISO later. I'm afraid it would break some things, but it's something to think about.
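To make the conversion point concrete, a rough sketch using only the standard library. The helper names are made up for this example, and it assumes the dates are plain strings in the formats named above.

```python
from datetime import datetime


def to_eight_digit(date_str):
    """Convert a YYMMDD string to YYYYMMDD.

    Note: %y has to guess the century (Python maps 00-68 to 20xx
    and 69-99 to 19xx), which is one more argument for storing
    the full year in the first place.
    """
    return datetime.strptime(date_str, "%y%m%d").strftime("%Y%m%d")


def to_iso(date_str):
    """Convert a YYYYMMDD string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(date_str, "%Y%m%d").date().isoformat()


if __name__ == "__main__":
    eight = to_eight_digit("140801")
    print(eight, to_iso(eight))
```

With the eight-digit form, going to ISO is just a matter of inserting hyphens; with six digits, every consumer has to make the same century assumption.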
I'm fine with hashing the text and including both a sequential ID and a fully unique ID. I'm also fine with the 8-digit date rather than the 6-digit one. I wonder if @philip-schrodt has any input on this? It should just be a matter of changing the format at https://github.com/openeventdata/phoenix_pipeline/blob/master/phox_pipeline.py#L27.
Need to add in a global, unique ID for each event record.