dask-contrib / dask-sql

Distributed SQL Engine in Python using Dask
https://dask-sql.readthedocs.io/
MIT License

SchemaError / NotImplementedError: The python type string is not implemented (yet) #1247

Open orlandombaa opened 1 year ago

orlandombaa commented 1 year ago

Hello

I have started using dask-sql, but I can't run even a simple query. The only one that works is a full selection with select * from df;. Every other query fails with the same error: SchemaError.

Like the following example:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

# Create a pandas DataFrame
test = {"Nombre": ["Orlando", "Fernando", "Rosario", "Cuah", "Verónica"],
        "Sexo": ["M", "M", "F", "M", "F"],
        "Edad": [30, 40, 50, 60, 56]}

test = pd.DataFrame(data=test)

# Create a Dask DataFrame from the pandas DataFrame
test = dd.from_pandas(data=test, npartitions=2)
print("Object type:", type(test))
test.head()

# Create a context for dask-sql
c = Context()

# Register the Dask DataFrame in the context so SQL queries can reference it
c.create_table(table_name="test", input_table=test)

result = c.sql("""
    SELECT
        Nombre
    FROM test
""")
result.compute()

Then I get this error: ParsingException: SchemaError(FieldNotFound { field: Column { relation: None, name: "nombre" }, valid_fields: [] })

I get the same error with the more complex DataFrames I am using. Can someone help me understand why this happens?

orlandombaa commented 1 year ago

Now, with the same code, I get the following error:

NotImplementedError: The python type string is not implemented (yet)

orlandombaa commented 1 year ago

Hello! I just solved this problem. For some reason, which I would love the community to explain, this code runs perfectly if I create my DataFrame with lowercase column names. If the column names are lowercase and I query them in lowercase, everything works.
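For anyone who wants to apply the same workaround to an existing DataFrame, here is a minimal sketch. The helper names lowercase_column_mapping and register_lowercased are hypothetical (they are not part of dask-sql); the sketch assumes only that the Dask DataFrame supports rename(columns=...) and that the context exposes create_table, as in the code above. It does not explain why mixed-case names fail, it only lowercases them before registering the table.

```python
def lowercase_column_mapping(columns):
    """Map each column name to its lowercase form.

    Raises ValueError if two different names would collapse to the
    same lowercase name (e.g. "Edad" and "EDAD").
    """
    mapping = {}
    seen = {}  # lowercase name -> original name that produced it
    for name in columns:
        lowered = str(name).lower()
        if lowered in seen and seen[lowered] != name:
            raise ValueError(
                f"columns {seen[lowered]!r} and {name!r} both map to {lowered!r}"
            )
        seen[lowered] = name
        mapping[name] = lowered
    return mapping


def register_lowercased(context, table_name, ddf):
    """Hypothetical helper: rename columns to lowercase, then register.

    Assumes `context` is a dask_sql.Context and `ddf` is a Dask DataFrame.
    """
    renamed = ddf.rename(columns=lowercase_column_mapping(ddf.columns))
    context.create_table(table_name, renamed)
    return renamed


# The mapping for the columns used in this issue:
print(lowercase_column_mapping(["Nombre", "Sexo", "Edad"]))
```

With the table registered this way, the query would refer to the column as nombre rather than Nombre.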