heavyai / rbc

Remote Backend Compiler
https://rbc.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Input format for WKT for geometric types. #22

Open Raikao opened 4 years ago

Raikao commented 4 years ago

What would be the correct syntax for implementing string concatenation as a UDF in OmniSci? The relation between data types across Python, Numba, and OmniSci is not clear when it comes to strings.

pearu commented 4 years ago

Currently, OmniSci UDFs support only the following first-class types:

bool, int8, int16, int32, int64, float, double

and arrays of these types. So string argument types and string manipulation are not supported within UDF definitions at the moment.

In general, RBC aims to translate Python function definitions into LLVM IR, which is sent to the OmniSci server and linked into the SQL query engine. Numba performs the translation. The data types of Python, Numba, and OmniSci SQL are related as follows:

  1. Python int and float map to SQL INTEGER and DOUBLE, respectively.
  2. Numba int8, int16, int32, int64 map to SQL TINYINT, SMALLINT, INTEGER, BIGINT, respectively.
  3. Python/Numba bool maps to SQL BOOLEAN.
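
The mapping above can be summarized as a small lookup table. This is only an illustrative sketch: the names `PYTHON_TO_SQL`, `NUMBA_TO_SQL`, and `sql_type_for` are hypothetical and not part of RBC, and the table covers only the types listed in this comment.

```python
# Hypothetical lookup tables restating the mapping listed above.
# These names are illustrative only; they are not part of the RBC API.
PYTHON_TO_SQL = {
    "int": "INTEGER",
    "float": "DOUBLE",
    "bool": "BOOLEAN",
}

NUMBA_TO_SQL = {
    "int8": "TINYINT",
    "int16": "SMALLINT",
    "int32": "INTEGER",
    "int64": "BIGINT",
    "bool": "BOOLEAN",
}


def sql_type_for(type_name, source="numba"):
    """Return the SQL type listed for a Python or Numba type name.

    Raises TypeError for anything not in the list, e.g. strings.
    """
    table = NUMBA_TO_SQL if source == "numba" else PYTHON_TO_SQL
    try:
        return table[type_name]
    except KeyError:
        raise TypeError(
            f"no SQL mapping listed for {source} type {type_name!r}"
        ) from None


if __name__ == "__main__":
    for name in ("int8", "int16", "int32", "int64", "bool"):
        print(f"numba {name} -> SQL {sql_type_for(name)}")
    for name in ("int", "float", "bool"):
        print(f"python {name} -> SQL {sql_type_for(name, source='python')}")
```

Looking up a string type such as `sql_type_for("str", source="python")` raises `TypeError`, which mirrors the fact that strings have no mapping in the list.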

Raikao commented 4 years ago

I see, so as long as the OmniSci SQL engine has no support for passing strings through LLVM, that functionality won't work.

Thanks for the explanation!

Raikao commented 4 years ago

I noticed that the OmniSci 5.1 release added support for UDFs on geometric types. What is the input format for working with them? Have you had a chance to document this feature?