Open nikkatalnikov opened 3 years ago
Hey Nik,
I'm out of the office for a few weeks but will check this asap when I'm back. Apologies for the delay!
On Thu, 6 May 2021, 00:17 Nik Katalnikov wrote:
- Dremio client version: 0.14.0
- Dremio version: 14.0.0
- Python version: 3.8
- Operating System: Mac OS 10.15.7 Catalina (Dockerized)
Description
I am trying to fetch physical dataset info via the simple client API, but it looks like it only returns VIRTUAL_DATASETs.
client = init(simple_client=True)
catalog_raw_api_data = client.catalog()
c_ids = map(lambda x: (x['id'], x['path']), catalog_raw_api_data['data'])
for (c_id, c_path) in c_ids:
    catalog_item = client.catalog_item(c_id, c_path)
    entity_type = catalog_item.get('entityType')
    print(catalog_item)
Alternatively, via DremioClient I can't get anything:
client = init()
catalog = client.data
pds = catalog.source.pds.get()
This raises an error:
Traceback (most recent call last):
  File "/Users/nikkatalnikov/opt/anaconda3/envs/flowtale/lib/python3.8/site-packages/dremio_client/model/data.py", line 484, in __getattr__
    value = dict.__getitem__(self, item)
KeyError: 'source'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nikkatalnikov/Desktop/projetcs/flowtale/flowtale-acc/application/dremio-datalake/dremio-clonner/dremio-exporter.py", line 8, in
    pds = catalog.source.pds.get()
  File "/Users/nikkatalnikov/opt/anaconda3/envs/flowtale/lib/python3.8/site-packages/dremio_client/model/data.py", line 492, in __getattr__
    return dict.__getitem__(self, item)
KeyError: 'source'

What am I doing wrong? Thank you!
@rymurr do you have an update on the issue? :)
Hey @nikkatalnikov sorry for the delay!
I ran the following against a clean Dremio 14.5.0
python -c "from dremio_client import init;c=init();[print(c.data[i]) for i in c.data]"|jq
{
"entityType": "home",
"id": "e37a0e32-919e-4edf-a54d-9e812a08bce6",
"name": null,
"tag": "qDM283kE6Og=",
"path": [
"@dremio"
],
"accessControlList": null
}
{
"entityType": "source",
"id": "cf7dd756-b37d-46a4-9662-6805ded0f8ee",
"name": null,
"description": null,
"tag": "Hqes8XM1Bnw=",
"type": "CONTAINER",
"config": null,
"createdAt": "2021-06-01T08:27:55.338Z",
"metadataPolicy": null,
"state": null,
"accelerationGracePeriodMs": null,
"accelerationRefreshPeriodMs": null,
"accelerationNeverExpire": null,
"accelerationNeverRefresh": null,
"path": [
"Samples"
],
"accessControlList": null
}
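The top-level entries above appear to be keyed by their own names (for example `Samples`), which would explain the `KeyError: 'source'` in the original report: judging from the traceback, attribute access falls through to a plain dictionary lookup, so `catalog.source` looks for an entry literally named `source`. A minimal sketch of that behaviour, using a hypothetical `AttrDict` class as a stand-in and not the actual `dremio_client` implementation:

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys can also be read as attributes."""

    def __getattr__(self, item):
        # Only called when normal attribute lookup fails, so fall back to the dict.
        return dict.__getitem__(self, item)


# Top-level entries keyed by name, loosely mirroring the output above.
catalog = AttrDict({"Samples": AttrDict({"entityType": "source"})})

# An existing name resolves through attribute access.
print(catalog.Samples["entityType"])

try:
    catalog.source  # no entry is literally named "source"
except KeyError as err:
    print("KeyError:", err)
```

Under that assumption, the fix on the caller's side is to use the name of the source as it appears in the catalog instead of the generic word `source`.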
And when looking at the PDSs in the Samples source:
python -c "from dremio_client import init;c=init();[print(c.data.Samples.samples_dremio_com[i].get()) for i in c.data.Samples.samples_dremio_com.get()]"|jq
{
"entityType": "file",
"id": "dremio:/Samples/samples.dremio.com/\"SF weather 2018-2019.csv\"",
"path": [
"Samples",
"samples.dremio.com",
"\"SF weather 2018-2019.csv\""
],
"accessControlList": null
}
{
"entityType": "dataset",
"id": "978dd231-abb8-4ae1-8c6f-1073d9e2d211",
"path": [
"Samples",
"samples.dremio.com",
"SF_incidents2016.json"
],
"tag": "0zNDJreBWoA=",
"type": "PHYSICAL_DATASET",
"fields": [
{
"name": "IncidntNum",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Category",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Descript",
"type": {
"name": "VARCHAR"
}
},
{
"name": "DayOfWeek",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Date",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Time",
"type": {
"name": "VARCHAR"
}
},
{
"name": "PdDistrict",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Resolution",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Address",
"type": {
"name": "VARCHAR"
}
},
{
"name": "X",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Y",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Location",
"type": {
"name": "VARCHAR"
}
},
{
"name": "PdId",
"type": {
"name": "BIGINT"
}
}
],
"createdAt": "2021-06-01T08:33:29.829Z",
"accelerationRefreshPolicy": null,
"sql": null,
"sqlContext": null,
"format": {
"type": "JSON",
"fullPath": [
"Samples",
"samples.dremio.com",
"SF_incidents2016.json"
],
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/SF_incidents2016.json"
},
"approximateStatisticsAllowed": null,
"accessControlList": null
}
{
"entityType": "file",
"id": "dremio:/Samples/samples.dremio.com/\"zip_lookup.csv\"",
"path": [
"Samples",
"samples.dremio.com",
"\"zip_lookup.csv\""
],
"accessControlList": null
}
{
"entityType": "file",
"id": "dremio:/Samples/samples.dremio.com/\"zips.json\"",
"path": [
"Samples",
"samples.dremio.com",
"\"zips.json\""
],
"accessControlList": null
}
{
"entityType": "folder",
"id": "dremio:/Samples/samples.dremio.com/\"Dremio University\"",
"path": [
"Samples",
"samples.dremio.com",
"\"Dremio University\""
],
"tag": null,
"accessControlList": null
}
{
"entityType": "folder",
"id": "dremio:/Samples/samples.dremio.com/\"NYC-taxi-trips\"",
"path": [
"Samples",
"samples.dremio.com",
"\"NYC-taxi-trips\""
],
"tag": null,
"accessControlList": null
}
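The output above suggests that physical datasets sit below a source, not at the top level of the catalog, so listing only the top-level entries would not reach them. A minimal sketch of a recursive walk that collects every `PHYSICAL_DATASET` path follows. The `fetch` callable, the `iter_physical_datasets` helper and the in-memory `FAKE_CATALOG` are all hypothetical, and the sketch assumes that container entries list their contents under a `children` key, which is not shown in the output above:

```python
from typing import Callable, Dict, Iterator, List


def iter_physical_datasets(
    fetch: Callable[[str], Dict], item_id: str
) -> Iterator[List[str]]:
    """Yield the path of every PHYSICAL_DATASET reachable from item_id."""
    item = fetch(item_id)
    if item.get("entityType") == "dataset" and item.get("type") == "PHYSICAL_DATASET":
        yield item["path"]
    # Assumption: containers (sources, folders) list their contents under "children".
    for child in item.get("children") or []:
        yield from iter_physical_datasets(fetch, child["id"])


# Tiny in-memory catalog standing in for a live server (hypothetical ids).
FAKE_CATALOG = {
    "src": {"entityType": "source", "path": ["Samples"], "children": [{"id": "dir"}]},
    "dir": {
        "entityType": "folder",
        "path": ["Samples", "samples.dremio.com"],
        "children": [{"id": "pds"}, {"id": "file"}],
    },
    "pds": {
        "entityType": "dataset",
        "type": "PHYSICAL_DATASET",
        "path": ["Samples", "samples.dremio.com", "SF_incidents2016.json"],
    },
    "file": {
        "entityType": "file",
        "path": ["Samples", "samples.dremio.com", "zips.json"],
    },
}

found = list(iter_physical_datasets(FAKE_CATALOG.__getitem__, "src"))
print(found)
```

With the simple client, `fetch` could in principle wrap the `client.catalog_item(c_id, c_path)` call from the original snippet, though that wiring is untested here. Note that entries with `entityType` `file` are skipped; in the output above only the promoted `SF_incidents2016.json` is reported as a `PHYSICAL_DATASET`.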