man-group / ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
http://arcticdb.io

If you pass a list of columns to read() or ReadRequest(), ArcticDB returns a result even if one or more of the columns are wrong, i.e. do not exist #2005

Open grusev opened 6 days ago

grusev commented 6 days ago

Describe the bug

If you ask ArcticDB to read only columns A, B, C, D, you get a result even if one or more of those columns do not exist. The result contains only the columns that do exist.

That is perhaps a nice thing, but it is not documented.

If you ask Pandas for the same thing, you get an error.
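
For comparison, here is a minimal sketch of the Pandas behaviour mentioned above. The tiny DataFrame and the column name 'wrong' are made up for illustration; selecting a list of labels that includes a missing one raises KeyError.

```python
import pandas as pd

# Small illustrative frame; only 'int32' and 'bool' exist
df = pd.DataFrame({"int32": [1, 2, 3], "bool": [True, False, True]})

raised = False
try:
    # 'wrong' is not a column of df, so the selection fails as a whole
    df[["int32", "wrong"]]
except KeyError:
    raised = True

print("pandas raised KeyError:", raised)
```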

Moreover, if you create a Query in ArcticDB that references a non-existent column, you also get an error.

Thus this behavior is rather inconsistent, although acceptable if documented well, along with #

Steps/Code to Reproduce

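# Imports assumed from the ArcticDB Python package and its test utilities
from arcticdb import DataError, QueryBuilder, ReadRequest
from arcticdb.util.test import get_sample_dataframe

def q(q):
    # Keep only the rows where the "bool" column is True
    return q[q["bool"]]

lib = arctic_library  # assumed to be a pytest fixture providing an ArcticDB library

symbol = "sym"
df = get_sample_dataframe(size=100)
df.reset_index(inplace=True, drop=True)
# 'wrong' does not exist in df and is requested twice
columns = ['wrong', 'int32', 'float64', 'strings', 'bool', 'wrong']

lib.write(symbol, df)

batch = lib.read_batch(symbols=[ReadRequest(symbol, as_of=0, query_builder=q(QueryBuilder()), columns=columns)])

# Expected a DataError because 'wrong' is missing; currently the read succeeds, so this assertion fails
assert isinstance(batch[0], DataError)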

Expected Results

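I would expect ArcticDB to raise an error when a requested column does not exist (for read_batch, to return a DataError for that request).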
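
Until this is decided, a caller-side check can approximate the strict behaviour described here. The sketch below is plain Python and not part of ArcticDB: missing_columns and require_columns are hypothetical helper names, and the list of available columns is assumed to come from the DataFrame that was written (e.g. list(df.columns)).

```python
def missing_columns(requested, available):
    """Return the requested names absent from `available`, in request order, without duplicates."""
    available_set = set(available)
    seen = set()
    missing = []
    for name in requested:
        if name not in available_set and name not in seen:
            seen.add(name)
            missing.append(name)
    return missing


def require_columns(requested, available):
    """Raise KeyError if any requested column is missing; otherwise return the request as a list."""
    missing = missing_columns(requested, available)
    if missing:
        raise KeyError(f"Columns not found: {missing}")
    return list(requested)


# Same column list as in the reproduction; `available` stands in for list(df.columns)
requested = ['wrong', 'int32', 'float64', 'strings', 'bool', 'wrong']
available = ['int32', 'float64', 'strings', 'bool']

print(missing_columns(requested, available))
```

Calling require_columns(columns, list(df.columns)) before building the ReadRequest would then fail early instead of silently dropping the missing columns.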

OS, Python Version and ArcticDB Version

any

Backend storage used

No response

Additional Context

No response