ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
4.42k stars 544 forks source link

[MapD] Add geo spatial datatype support #1666

Closed xmnlab closed 5 years ago

xmnlab commented 5 years ago

Resolves partial #1665

MapD GEO spatial functions support should be addressed in another PR

xmnlab commented 5 years ago

@cpcloud @kszucs

I am moving the geospatial types to it own file. now I have a question. Inside geospatial.py I will import datatypes to create geo spatial data types. so where I should import/add the spatial data? My first guess is to add this inside datatype.py but it will create a circular import ... so I think it is not desired.

what would be the workflow for this implementation?

xmnlab commented 5 years ago

@cpcloud @kszucs

I moved the geospatial types to it own file. now I have a question. Inside geospatial.py I will import datatypes to create geo spatial data types. so where I should import/add the spatial data? My first guess is to add this inside datatype.py but it will create a circular import ... so I think it is not desired.

what would be the workflow for this implementation?

xmnlab commented 5 years ago

@cpcloud @kszucs we decide to remove the dependence of shapely for now, so I moved back again geo spatial data type from geospatial.py to datatype.py and I removed geospatial.py

just testing the new data types it seems it is working. but It seems I am missing something related to inference ... could you provide any help?

thanks!

kszucs commented 5 years ago

@xmnlab Try removing the infer functions above.

xmnlab commented 5 years ago

@kszucs thanks for the feedback!

with out infer function, it raises an error:

---------------------------------------------------------------------------
InputTypeError                            Traceback (most recent call last)
<ipython-input-3-26e2613b0aa2> in <module>
      1 point = (0, 0)
----> 2 l_point = ibis.literal(point, type='point')
      3 
      4 print(l_point.compile())
      5 print(type(l_point))

/mnt/sda1/dev/quansight/ibis/ibis/expr/types.py in literal(value, type)
    894         dtype = dt.null
    895     else:
--> 896         dtype = dt.infer(value)
    897 
    898     if type is not None:

/mnt/sda1/storage/miniconda/envs/ibis/lib/python3.6/site-packages/multipledispatch/dispatcher.py in __call__(self, *args, **kwargs)
    276             self._cache[types] = func
    277         try:
--> 278             return func(*args, **kwargs)
    279 
    280         except MDNotImplementedError:

/mnt/sda1/dev/quansight/ibis/ibis/expr/datatypes.py in infer_dtype_default(value)
   1189 @infer.register(object)
   1190 def infer_dtype_default(value):
-> 1191     raise com.InputTypeError(value)
   1192 
   1193 
xmnlab commented 5 years ago

@kszucs

I could resolve the problem here:

let me know if this is reasonable or if you prefer another approach.

kszucs commented 5 years ago

That sounds good!

xmnlab commented 5 years ago

@kszucs it seems all tests passed except for azure (mysql installation issue) and python27 (sqlite issues)

do you have any idea how to fixed that?

kszucs commented 5 years ago

MySQL testing has been disabled on the master. I'm rebasing it.

codecov[bot] commented 5 years ago

Codecov Report

Merging #1666 into master will decrease coverage by 2.56%. The diff coverage is 93.9%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1666      +/-   ##
=========================================
- Coverage   89.97%   87.4%   -2.57%     
=========================================
  Files         186     186              
  Lines       27300   27486     +186     
  Branches     2311    2344      +33     
=========================================
- Hits        24563   24024     -539     
- Misses       2335    3050     +715     
- Partials      402     412      +10
Impacted Files Coverage Δ
ibis/mapd/client.py 50.85% <ø> (ø) :arrow_up:
ibis/expr/tests/test_decimal.py 100% <ø> (ø) :arrow_up:
ibis/expr/tests/test_value_exprs.py 99.5% <100%> (+0.01%) :arrow_up:
ibis/expr/types.py 91.77% <100%> (+0.5%) :arrow_up:
ibis/expr/tests/test_datatypes.py 100% <100%> (ø) :arrow_up:
ibis/mapd/tests/test_operations.py 98.11% <100%> (+0.61%) :arrow_up:
ibis/expr/api.py 93.06% <100%> (-0.38%) :arrow_down:
ibis/mapd/operations.py 72.59% <83.33%> (+1.17%) :arrow_up:
ibis/expr/datatypes.py 94.87% <93.47%> (-0.2%) :arrow_down:
ibis/bigquery/tests/test_client.py 25.87% <0%> (-73.55%) :arrow_down:
... and 18 more
xmnlab commented 5 years ago

Thanks @kszucs !! It seems awesome!

kszucs commented 5 years ago

@xmnlab @cpcloud In overall the new datatypes are working, however We'll need a couple of follow-up PRs.

The current implementation doesn't reflect the hierarchy between the spatial types:

Point = Array[Numeric, 2]
Line = Array[Point]
Polygon = Array[Line]
Multypolygon = Array[Polygon]

And the implicit casting rules need to be more restrictive as well.

xmnlab commented 5 years ago

@cpcloud @kszucs

I was thinking, I will change the geospatial literals to accept 2 new arguments, maybe something like (option 1):

literal((1, 2), type='point;4326:geography')

when geotype (geography) and srid (4326) are optionals.

is this pattern ok for geo types? I am trying to do something more closely to the postgis literal, example:

'SRID=4326;POINT(1 2)'::geography

or should I use something like the other ibis types, using () or <> ? Example (option 2):

literal((1, 2), type='point<geography>(4326))
xmnlab commented 5 years ago

hey @kszucs @cpcloud

any thought about the last comment (https://github.com/ibis-project/ibis/pull/1666#issuecomment-443954609) ?

xmnlab commented 5 years ago

@cpcloud @kszucs any feedback?

xmnlab commented 5 years ago

that is awesome! thanks @kszucs !

xmnlab commented 5 years ago

@cpcloud any thought about this PR?

xmnlab commented 5 years ago

@cpcloud any thought about this PR?

kszucs commented 5 years ago

@xmnlab please rebase

xmnlab commented 5 years ago

@kszucs @cpcloud I rebased it from master. could this PR be merged?

xmnlab commented 5 years ago

@cpcloud @kszucs any feedback about this PR?

kszucs commented 5 years ago

@xmnlab please add an unreleased version to https://github.com/ibis-project/ibis/blob/a16772aea43a936cadc39046ab96d3f54526ecdf/docs/source/release.rst

With an "Experimental GeoSpatial datatype support" (or something like that) entry and a reference to the appropiate issue.

kszucs commented 5 years ago

The coverage is good now too. Thanks @xmnlab!