Datatamer / tamr-client

Programmatically interact with Tamr
https://tamr-client.readthedocs.io
Apache License 2.0
11 stars 25 forks source link

TC: Move tc.attribute.from_dataset_all to tc.dataset.attributes #428

Closed skalish closed 4 years ago

skalish commented 4 years ago

↪️ Pull Request

This PR moves the function tamr_client.attribute.from_dataset_all to tamr_client.dataset.attributes. Organizing the code to group together functionality around what can be asked about or what can be done with a dataset is more natural than grouping together the different ways a user might get attributes (e.g. from_dataset, from_project).

Closes #418 .

This PR also makes a minor modification to the dev task format so that the black flag --diff can be used.

💻 Examples

s = fake.session()
dataset = fake.dataset()

attrs = tc.dataset.attributes(s, dataset)

row_num = attrs[0]
attr_1 = attrs[1]
attr_2 = attrs[2]

✔️ PR Todo

skalish commented 4 years ago

I could roll updating the dataset tests to use the fake utility into this PR if we want to continue that conversion. (#423)

pcattori commented 4 years ago

I could roll updating the dataset tests to use the fake utility into this PR if we want to continue that conversion. (#423)

@skalish : Yes, we should be migrating tests to use fake utility as we touch them. This will require adding ndjson capabilities to fake.json. I'm envisioning that the fake JSON file can specify {..., "response": { "ndjson": [...] } } and the list supplied in fake_json["response"]["ndjson"] gets used to set up responses with a response stream.

skalish commented 4 years ago

Still working on migrating the attribute tests to use fake.json, so a merge should wait a bit longer.

skalish commented 4 years ago

I’m getting stuck on one last thing for updating the attribute testing. With the changes I’ve made, there is still a heavy reliance on loading tests/tamr_client/data/attribute.json and tests/tamr_client/data/attributes.json. In one place, this is used in the test_from_json function that is not about interacting with the server and I’m not sure how to change that to not use this file. On the other hand, it is also used in the tests to compare the response from the server to the Attribute that is generated using _from_json with the file’s contents https://github.com/Datatamer/tamr-client/blob/0a1b54d5e0f158f22cc0ccbea5d7b9914548524c/tests/tamr_client/attribute/test_attribute.py#L61 I think the way the project tests are written is neater, where some attributes are checked against hard-coded values https://github.com/Datatamer/tamr-client/blob/0a1b54d5e0f158f22cc0ccbea5d7b9914548524c/tests/tamr_client/test_project.py#L13-L15 but without resolving test_from_json it wouldn’t let us drop those (now kind of redundant) JSON files.

skalish commented 4 years ago

The tests that make use of fake.json no longer also load a JSON from tests/tamr_client/data, requiring redundant data in multiple locations. The two tests that do not use fake.json still load attributes.json, so that file has not been removed (it is used by tests in tests/tamr_client/attribute/test_type.py).