catalyst-cooperative / ccai-entity-matching

An exploration of generalizable approaches to unsupervised entity matching for use in linking tabular public energy data sources.
MIT License

Set up record-linkage experiment infrastructure #37

Closed zaneselvans closed 7 months ago

zaneselvans commented 1 year ago

We've got a bunch of (potential) experiments that we want to compare, so setting up a framework for running them all in a repeatable way will be helpful.

- [x] Switch to pulling nightly build DB from S3 rather than datasette (#41)
- [x] #38 (#41)
- [x] #39 (#41)
- [ ] #32
- [ ] https://github.com/catalyst-cooperative/ccai-entity-matching/issues/48
- [ ] Unpin the `pudl` dependency from a commit and use `main` once nightly builds are running on `main`
- [ ] Integrate the utility name merge onto PPL into the PUDL `plant_parts_eia` analysis