Closed by pcattori 4 years ago
Here is some starter code that such a tutorial would probably build on:
from getpass import getpass
import tamr_client as tc
username = input("Tamr Username:")
password = getpass("Tamr Password:")
auth = tc.UsernamePasswordAuth(username, password)
session = tc.session.from_auth(auth)
instance = tc.Instance(host="localhost", port=9100)
project_id = "1" # replace with your project ID
project = tc.project.from_resource_id(session, instance, project_id)  # the lookup needs the session and instance created above
def check(op: tc.Operation):
    if not tc.operation.succeeded(op):
        raise RuntimeError("Operation failed.")
    return op
check(tc.mastering.update_unified_dataset(session, project))
check(tc.mastering.generate_pairs(session, project))
check(tc.mastering.apply_feedback(session, project))
check(tc.mastering.update_pair_results(session, project))
check(tc.mastering.update_high_impact_pairs(session, project))
check(tc.mastering.update_cluster_results(session, project))
check(tc.mastering.publish_clusters(session, project))
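One thing the tutorial might want to address: when a step in the sequence above fails, the error does not say which step it was. Below is a minimal sketch of a runner that stops at the first failed step and names it. It runs without a Tamr instance because `StubOperation` and `run_steps` are hypothetical names made up for this illustration and are not part of `tamr_client`. In real use each lambda would wrap one of the `tc.mastering` calls above, and the success check would go through `tc.operation.succeeded`.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class StubOperation:
    """Hypothetical stand-in for tc.Operation (not part of tamr_client)."""

    description: str
    succeeded: bool


# A step is a name plus a zero-argument callable that runs it.
Step = Tuple[str, Callable[[], StubOperation]]


def run_steps(steps: Sequence[Step]) -> List[str]:
    """Run steps in order and stop at the first one that does not succeed."""
    completed: List[str] = []
    for name, step in steps:
        op = step()
        if not op.succeeded:
            raise RuntimeError(
                f"Step '{name}' failed after completing: {completed}"
            )
        completed.append(name)
    return completed


# Stubs stand in for the real calls, e.g.
# lambda: tc.mastering.generate_pairs(session, project)
steps = [
    ("update_unified_dataset", lambda: StubOperation("update unified dataset", True)),
    ("generate_pairs", lambda: StubOperation("generate pairs", True)),
    ("apply_feedback", lambda: StubOperation("apply feedback", False)),
    ("update_pair_results", lambda: StubOperation("update pair results", True)),
]

try:
    run_steps(steps)
except RuntimeError as err:
    print(err)
```

Because the third stub reports failure, the fourth step never runs, and the error message lists the two steps that did complete. That makes it clear where to resume.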
Existing docs: https://docs.tamr.com/tamr-tutorials/docs/overview-mastering
🙋 feature request
We need a tutorial that shows how to keep a Mastering project up to date with new data and labels.
🔦 Context
Mastering projects are extremely common, but the workflow is complex. We need a guide that shows users how to manage their existing Mastering projects programmatically.
Tasks