catalyst-cooperative / ccai-entity-matching

An exploration of generalizable approaches to unsupervised entity matching for use in linking tabular public energy data sources.
MIT License

Evaluate TF-IDF Attribute Embedding #33

Closed zaneselvans closed 8 months ago

zaneselvans commented 1 year ago

Use TF-IDF to vectorize string features, and then test standard linkage performance with:

## Tuple Embedding Methods
- [ ] #35 
- [ ] Weighted Aggregation (with training data)
- [ ] Autoencoder (neural network)
- [ ] seq2seq (neural network)
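
For reference, the TF-IDF vectorization step might look roughly like the sketch below. It is a pure standard-library illustration, not this repository's implementation: the character-trigram tokenization, the smoothed IDF formula, the helper names (`char_ngrams`, `tfidf_vectors`, `cosine`), and the sample plant names are all assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split a lowercased, space-padded string into overlapping character n-grams."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def tfidf_vectors(values, n=3):
    """Return one L2-normalized sparse TF-IDF vector (n-gram -> weight) per string."""
    docs = [Counter(char_ngrams(v, n)) for v in values]
    # Document frequency: number of strings in which each n-gram appears.
    doc_freq = Counter(gram for doc in docs for gram in doc)
    num_docs = len(docs)
    vectors = []
    for doc in docs:
        # Smoothed IDF (an assumption): n-grams found in every string keep a positive weight.
        vec = {
            gram: tf * (math.log((1 + num_docs) / (1 + doc_freq[gram])) + 1)
            for gram, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({g: w / norm for g, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(w * b.get(g, 0.0) for g, w in a.items())


# Hypothetical values of one string attribute drawn from two data sources.
names = ["Barry Steam Plant", "Barry Steam", "Comanche Peak Nuclear"]
vecs = tfidf_vectors(names)
sim_match = cosine(vecs[0], vecs[1])
sim_nonmatch = cosine(vecs[0], vecs[2])
```

Here `sim_match` should come out higher than `sim_nonmatch`, since the first two names share most of their trigrams. Per-attribute vectors like these would then feed into whichever tuple embedding method from the list above is being tested.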