ERROR: Could not get partition values for file: hdfs://anahnn/visa/user/chavesrl/chavesrl.db/transuk2m2019_mini/000000_0
ERROR: Could not get partition values for file: hdfs://anahnn/visa/user/chavesrl/chavesrl.db/transuk2m2019_mini/000001_0
ERROR: Could not get partition values for file: hdfs://anahnn/visa/user/chavesrl/chavesrl.db/transuk2m2019_mini/000002_0
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-edf72ca4fd46> in <module>
4 hive_table_name = 'transuk2m2019_mini',
5 hive_database_name = 'chavesrl',
----> 6 file_format = 'parquet'
7 )
/projects/gds/chavesrl/condapv/envs/visaverse-gpu/lib/python3.7/site-packages/pyblazing/apiv2/context.py in create_table(self, table_name, input, **kwargs)
2458 ):
2459 parsedMetadata = self._parseMetadata(
-> 2460 file_format_hint, table.slices, parsedSchema, kwargs
2461 )
2462
/projects/gds/chavesrl/condapv/envs/visaverse-gpu/lib/python3.7/site-packages/pyblazing/apiv2/context.py in _parseMetadata(self, file_format_hint, currentTableNodes, schema, kwargs)
2714 schema["names"] = [i.encode() for i in schema["names"]]
2715 if "names" in kwargs:
-> 2716 kwargs["names"] = [i.encode() for i in kwargs["names"]]
2717
2718 if self.dask_client:
/projects/gds/chavesrl/condapv/envs/visaverse-gpu/lib/python3.7/site-packages/pyblazing/apiv2/context.py in <listcomp>(.0)
2714 schema["names"] = [i.encode() for i in schema["names"]]
2715 if "names" in kwargs:
-> 2716 kwargs["names"] = [i.encode() for i in kwargs["names"]]
2717
2718 if self.dask_client:
AttributeError: 'bytes' object has no attribute 'encode'
The table I am trying to read is Parquet, but specifying that does not help either. Stepping through with the debugger, I found that i.encode() is being called on i, which is already a byte string.
Expected behavior
Column names are read properly. Perhaps pyblazing could detect that the strings are already encoded.
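To illustrate what I mean, here is a minimal sketch of the failing pattern and a tolerant alternative. The helper name ensure_bytes and the sample column names are hypothetical, made up for this example; this is not pyblazing's actual code, only a guess at how the list comprehension in _parseMetadata could avoid the error.

```python
def ensure_bytes(name, encoding="utf-8"):
    """Return name as bytes, encoding it only if it is still a str."""
    if isinstance(name, bytes):
        return name
    return name.encode(encoding)


# Hypothetical column names: one str and one already-encoded bytes value.
names = ["card_id", b"amount"]

# The pattern from the traceback raises as soon as it reaches a bytes element.
try:
    [i.encode() for i in names]
except AttributeError as exc:
    error_message = str(exc)

# The tolerant version leaves bytes untouched and encodes only str values.
encoded = [ensure_bytes(i) for i in names]
```

With something along these lines, names that arrive already encoded would pass through unchanged instead of raising.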
Environment overview (please complete the following information)
Environment location: Bare metal
Method of BlazingSQL install: conda
BlazingSQL Version which can be obtained by doing as follows:
import blazingsql
print(blazingsql.__info__())
BlazingSQL version (git hash): ff4ece0366a4d76bf533baeb03dd03bdfc5232be
BlazingSQL branch name: HEAD
BlazingSQL branch tag: v0.19.0
BlazingSQL build id: 0
BlazingSQL compiler version: GNU /usr/bin/c++ 7.5.0
BlazingSQL cuda flags: -Xcompiler -Wno-parentheses -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 --expt-extended-lambda --expt-relaxed-constexpr -Werror=cross-execution-space-call -Xcompiler -Wall,-Wno-error=deprecated-declarations --default-stream=per-thread -DHT_DEFAULT_ALLOCATOR
BlazingSQL Operating system kernel: Linux-5.4.0-1038-aws
BlazingSQL Operating system architecture: x86_64
BlazingSQL Linux Operating system release: NAME=Ubuntu|VERSION=16.04.7 LTS (Xenial Xerus)|ID=ubuntu|ID_LIKE=debian|PRETTY_NAME=Ubuntu 16.04.7 LTS|VERSION_ID=16.04|HOME_URL=http://www.ubuntu.com/|SUPPORT_URL=http://help.ubuntu.com/|BUG_REPORT_URL=http://bugs.launchpad.net/ubuntu/|VERSION_CODENAME=xenial|UBUNTU_CODENAME=xenial
None
Describe the bug
I get the error shown above when creating a table with a pyhive cursor.