Datatamer / tamr-client

Programmatically interact with Tamr
https://tamr-client.readthedocs.io
Apache License 2.0

Create new dataset #67

Closed DerrickRice closed 5 years ago

DerrickRice commented 5 years ago

💬 RFC

I am trying to upload new data. I would prefer to create a new dataset, but the Python client doesn't support that, so I have to use Dataset.update_records on an existing dataset instead.

I would like to be able to create a new dataset.

🔦 Context

Concerns with updating into an existing dataset:

The server's API appears to support creating new datasets, but the Python client does not expose this: https://docs.tamr.com/reference#create-dataset

💁 Possible Solution

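from tamr_unify_client.models.dataset.resource import Dataset


def create_dataset(unify, dataset_config):
    """
    Create a dataset in Unify.

    :param unify: Unify client
    :param dataset_config: Dataset configuration (sent as the JSON request body)
    :return: the created Dataset
    """
    # POST the configuration to the datasets collection and check the
    # response status before parsing the body.
    data = unify.post(unify.datasets.api_path, json=dataset_config).successful().json()
    # Wrap the response in a Dataset resource located at its relative ID.
    return Dataset(unify, data, data["relativeId"])
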
pcattori commented 5 years ago

Related to #137

pcattori commented 5 years ago

@mollysacks this is asking for a convenience function that does:

  1. (Empty) dataset creation
  2. Set dataset schema
  3. Upload all records to that dataset

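
The three steps above can be sketched as one convenience function. This is only an illustration under stated assumptions: `InMemoryDatasetStore`, `InMemoryDataset`, and `create_dataset_from_records` are hypothetical stand-ins invented for this sketch, not part of tamr-client, and deriving the schema from the record keys is one possible choice rather than something the issue specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class InMemoryDataset:
    """Hypothetical stand-in for a server-side dataset."""
    name: str
    key_attribute: str
    attributes: List[str] = field(default_factory=list)
    records: Dict[str, dict] = field(default_factory=dict)


class InMemoryDatasetStore:
    """Hypothetical stand-in for the server; NOT the tamr-client API."""

    def __init__(self) -> None:
        self._datasets: Dict[str, InMemoryDataset] = {}

    def create_empty(self, name: str, key_attribute: str) -> InMemoryDataset:
        if name in self._datasets:
            raise ValueError(f"dataset {name!r} already exists")
        dataset = InMemoryDataset(name, key_attribute)
        self._datasets[name] = dataset
        return dataset

    def set_schema(self, dataset: InMemoryDataset, attributes: List[str]) -> None:
        if dataset.key_attribute not in attributes:
            raise ValueError("schema must include the key attribute")
        dataset.attributes = list(attributes)

    def upsert_records(self, dataset: InMemoryDataset, records: Iterable[dict]) -> None:
        for record in records:
            # Records are keyed by the value of the dataset's key attribute.
            dataset.records[str(record[dataset.key_attribute])] = dict(record)


def create_dataset_from_records(
    store: InMemoryDatasetStore,
    name: str,
    key_attribute: str,
    records: Iterable[dict],
) -> InMemoryDataset:
    """Create a dataset, set its schema, and upload records in one call."""
    records = list(records)
    # Step 1: create the (empty) dataset.
    dataset = store.create_empty(name, key_attribute)
    # Step 2: set the schema. Assumption: the schema is the union of all
    # record keys, kept in first-seen order.
    attributes = list(dict.fromkeys(key for record in records for key in record))
    store.set_schema(dataset, attributes)
    # Step 3: upload all records to the new dataset.
    store.upsert_records(dataset, records)
    return dataset
```

A real implementation would replace the three store methods with the corresponding server requests; the point of the sketch is only the ordering of the steps behind a single call.
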
pcattori commented 5 years ago

The design should be Collection#create, e.g. DatasetCollection.create, not Project#create_dataset or Client#create_dataset.
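
A minimal sketch of what the Collection#create shape could look like, reusing the request pattern from the possible solution earlier in this thread. Everything here is an assumption made for illustration: `StubClient` and `StubResponse` are hypothetical test doubles, and the `Dataset` and `DatasetCollection` classes below are simplified stand-ins, not the real tamr-client classes.

```python
from typing import Any, Dict, List, Tuple


class StubResponse:
    """Hypothetical response double with the successful()/json() chain."""

    def __init__(self, status_code: int, body: Dict[str, Any]) -> None:
        self.status_code = status_code
        self._body = body

    def successful(self) -> "StubResponse":
        if not 200 <= self.status_code < 300:
            raise RuntimeError(f"request failed with status {self.status_code}")
        return self

    def json(self) -> Dict[str, Any]:
        return self._body


class Dataset:
    """Simplified stand-in for a dataset resource."""

    def __init__(self, client: "StubClient", data: Dict[str, Any], api_path: str) -> None:
        self.client = client
        self._data = data
        self.api_path = api_path

    @property
    def name(self) -> str:
        return self._data["name"]


class DatasetCollection:
    """Simplified stand-in: creation lives on the collection, not the client."""

    def __init__(self, client: "StubClient", api_path: str) -> None:
        self.client = client
        self.api_path = api_path

    def create(self, creation_spec: Dict[str, Any]) -> Dataset:
        # POST the spec to the collection's own path, then wrap the response.
        data = self.client.post(self.api_path, json=creation_spec).successful().json()
        return Dataset(self.client, data, data["relativeId"])


class StubClient:
    """Hypothetical client double that records requests instead of sending them."""

    def __init__(self) -> None:
        self.requests: List[Tuple[str, Dict[str, Any]]] = []
        self.datasets = DatasetCollection(self, "datasets")

    def post(self, path: str, json: Dict[str, Any]) -> StubResponse:
        self.requests.append((path, json))
        # Fabricated ID for the sketch; a real server assigns its own.
        relative_id = f"{path}/{len(self.requests)}"
        return StubResponse(201, {**json, "relativeId": relative_id})
```

With this shape the call site reads `client.datasets.create(spec)`, which keeps creation next to the other collection operations instead of adding a one-off method to the client or project.
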

@mollysacks this is relevant for #150