OHDSI / SelfControlledCaseSeries

An R package for performing Self-Controlled Case Series (SCCS) analyses in an observational database in the OMOP Common Data Model.
http://ohdsi.github.io/SelfControlledCaseSeries

Use hash for exposures_outcome_id #58

Closed schuemie closed 3 months ago

schuemie commented 3 months ago

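Currently the exposures_outcome_id is auto-generated sequentially, starting at 1. This causes problems when a set of SCCS analyses is distributed over multiple machines and the results are later combined in a single results schema: each machine restarts the sequence at 1, so the same ID can end up referring to different exposures-outcome sets.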

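One way to resolve this is to use a hash of the exposures, outcome, and nesting cohort IDs as the exposures_outcome_id instead, so that the ID depends only on the content and every machine derives the same value. For this we could use the Murmur32 hash in the digest package, which is fast and spreads its inputs evenly over the 32-bit range, making collisions unlikely (though not impossible).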
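
To illustrate the idea, here is a minimal sketch in Python rather than R, using a pure-Python MurmurHash3 (32-bit) so that it has no dependencies. The function name `exposures_outcome_id`, the `|`- and `,`-separated key format, and the sorting of exposure IDs are assumptions made for this example; they are not what the package does, and the hash values are not guaranteed to match what `digest` would return for the same IDs, since that depends on how the input is serialized.

```python
from typing import Iterable, Optional


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) of `data`, returned as an unsigned 32-bit int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF
    h = seed & mask
    n = len(data)
    body_len = n - (n % 4)

    def mix_k(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    # Process the input in 4-byte little-endian blocks.
    for i in range(0, body_len, 4):
        h ^= mix_k(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask

    # Mix in the remaining 1 to 3 bytes, if any.
    tail = data[body_len:]
    if tail:
        h ^= mix_k(int.from_bytes(tail, "little"))

    # Finalization: force all bits of the hash to avalanche.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def exposures_outcome_id(
    exposure_ids: Iterable[int],
    outcome_id: int,
    nesting_cohort_id: Optional[int] = None,
) -> int:
    """Hypothetical helper: derive an ID from the content of the set.

    The key format below is an assumption for illustration only.
    Exposure IDs are sorted so their order does not change the ID.
    """
    key = "|".join([
        ",".join(str(i) for i in sorted(exposure_ids)),
        str(outcome_id),
        "" if nesting_cohort_id is None else str(nesting_cohort_id),
    ])
    return murmur3_32(key.encode("utf-8"))


if __name__ == "__main__":
    # Two "machines" compute the ID independently and agree on it.
    print(exposures_outcome_id([1001, 1002], outcome_id=2001))
    print(exposures_outcome_id([1002, 1001], outcome_id=2001))
```

Because the ID is a pure function of the exposures, outcome, and nesting cohort IDs, results computed on different machines can be merged without renumbering. Two caveats apply: a 32-bit hash can still collide when many sets are combined, so a uniqueness check on merge remains worthwhile, and the value here is unsigned, whereas R integers are signed 32-bit, so an R implementation would need to map the hash into that range.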