ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

feat: Support for `ibis.dtype("geospatial:geometry")` #10170

Closed Riezebos closed 1 month ago

Riezebos commented 1 month ago

Is your feature request related to a problem?

I wanted to store a table's schema in a JSON file so that I can easily compare new data extractions against the existing schema and convert them to it. If I convert the schema to a regular dictionary and back, I get an error:

pairs = {n: str(t) for n, t in table.schema().fields.items()}
ibis.schema(pairs)

Output:

SignatureValidationError: Schema({'id': 'String', 'created_at': 'String', 'updated_at': 'String', 'removed': 'Boolean', 'description': 'Null', 'location': 'GeoSpatial', 'user_by_created_by': 'Struct', 'campaign': 'Struct', 'postertype': 'Null'}) has failed due to the following errors:
  `fields`: {'id': 'String', 'created_at': 'String', 'updated_at': 'String', 'removed': 'Boolean', 'description': 'Null', 'location': 'GeoSpatial', 'user_by_created_by': 'Struct', 'campaign': 'Struct', 'postertype': 'Null'} is not matching GenericMappingOf(key=InstanceOf(type=<class 'str'>), value=CoercedTo(type=<class 'ibis.expr.datatypes.core.DataType'>, func=<bound method DataType.__coerce__ of <class 'ibis.expr.datatypes.core.DataType'>>), type=CoercedTo(type=<class 'ibis.common.collections.FrozenOrderedDict'>, func=<class 'ibis.common.collections.FrozenOrderedDict'>))

Expected signature: Schema(fields: FrozenOrderedDict[str, DataType])

If I remove the `geospatial:` prefix from the type string, it works:

pairs = {n: str(t).replace("geospatial:", "") for n, t in table.schema().fields.items()}
ibis.schema(pairs=pairs)

Output:

ibis.Schema {
  id                  string
  created_at          string
  updated_at          string
  removed             boolean
  description         null
  location            geospatial:geometry
  user_by_created_by  struct<email: string, id: string, name: string>
  campaign            struct<country: string, election: struct<electionDay: string, id: string, name: string>, id: string, name: string, region: string>
  postertype          null
}
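For anyone hitting the same problem, here is a minimal sketch of the JSON round trip using the workaround above. The helper names `schema_to_json` and `schema_from_json` are hypothetical, not part of ibis, and the `example` dictionary stands in for the `{n: str(t) ...}` mapping built from a real table. It assumes, as reported above, that the type strings parse once the `geospatial:` prefix is stripped.

```python
import json

GEO_PREFIX = "geospatial:"  # prefix that the issue reports as unparseable


def schema_to_json(type_strings: dict[str, str]) -> str:
    """Serialize a {column: type string} mapping, dropping the prefix."""
    cleaned = {
        name: type_str.replace(GEO_PREFIX, "")
        for name, type_str in type_strings.items()
    }
    return json.dumps(cleaned, indent=2)


def schema_from_json(text: str):
    """Rebuild an ibis schema from JSON produced by schema_to_json."""
    import ibis  # imported lazily so the helper above runs without ibis

    return ibis.schema(json.loads(text))


# Stand-in for {n: str(t) for n, t in table.schema().fields.items()}
example = {
    "id": "string",
    "removed": "boolean",
    "location": "geospatial:geometry",
}
serialized = schema_to_json(example)
```

Calling `schema_from_json(serialized)` should then give back a schema whose `location` column prints as `geospatial:geometry`, matching the output shown above.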

What is the motivation behind your request?

No response

Describe the solution you'd like

I'd like the string form of every datatype to round-trip, so that `ibis.dtype(str(t))` returns the original type `t`.
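The requested property can be checked mechanically. The sketch below is a hypothetical helper, not an ibis API: it takes any iterable of types plus a parser and reports the string forms that do not parse back to an equal value.

```python
from typing import Callable, Iterable


def non_roundtripping(types: Iterable, parse: Callable[[str], object]) -> list[str]:
    """Return the string forms of the types that `parse` cannot restore."""
    failures = []
    for t in types:
        text = str(t)
        try:
            ok = parse(text) == t
        except Exception:  # a parse error counts as a failed round trip
            ok = False
        if not ok:
            failures.append(text)
    return failures
```

With ibis installed, something like `non_roundtripping(table.schema().fields.values(), ibis.dtype)` would be expected to list `geospatial:geometry` for the table in this report and to return an empty list once the feature is supported.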

What version of ibis are you running?

9.5.0

What backend(s) are you using, if any?

DuckDB with spatial extension


cpcloud commented 1 month ago

Thanks for the report!

We should definitely support this.