TheDataStation / ver

Data Discovery Tools and Systems
MIT License

dindex_builder compilation error when using python 3.8.10 #43

Closed: snowgy closed this issue 1 year ago

snowgy commented 1 year ago

The earliest Python 3.8 release that can be installed on Apple M1 is Python 3.8.10.

When I ran the following command with Python 3.8.10 to build indices on top of the data profiles:

python3 dindex_builder.py --profile_data_path ddprofiler/output_profiles_json

I got the following error:

DIndex Builder
Building DIndex. profile_data_path: ddprofiler/output_profiles_json
Initializing profile index: duckdb...
Traceback (most recent call last):
  File "dindex_builder.py", line 104, in <module>
    dindex = build_dindex(args.profile_data_path, cnf, force=args.force)
  File "dindex_builder.py", line 17, in build_dindex
    dindex = DiscoveryIndex(config, force=force)
  File "/Users/yuegong/Documents/ver/dindex_store/discovery_index.py", line 52, in __init__
    self.__profile_index = DiscoveryIndex.profile_index_mapping[profile_index](config, load=load, force=force)
TypeError: Can't instantiate abstract class ProfileIndexDuckDB with abstract methods get_profile
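
For context, this TypeError is standard behavior for Python's abc module: a class that inherits an abstract method without overriding it cannot be instantiated. The sketch below reproduces the situation with hypothetical stand-in classes (ProfileIndexBase, IncompleteIndex and StubbedIndex are illustrative names, not the actual ver classes) and shows that a stub override is enough to make the class instantiable. It does not explain why the method is missing in the real ProfileIndexDuckDB.

```python
from abc import ABC, abstractmethod


class ProfileIndexBase(ABC):
    """Hypothetical stand-in for the abstract profile index interface."""

    @abstractmethod
    def add_profile(self, profile):
        ...

    @abstractmethod
    def get_profile(self, profile_id):
        ...


class IncompleteIndex(ProfileIndexBase):
    """Overrides add_profile only, so get_profile is still abstract."""

    def add_profile(self, profile):
        pass


class StubbedIndex(IncompleteIndex):
    """Adds a placeholder get_profile, which makes the class concrete."""

    def get_profile(self, profile_id):
        # Temporary stub: fails loudly if anything actually calls it.
        raise NotImplementedError("get_profile is a temporary stub")


def try_instantiate(cls):
    """Return 'ok' if cls() succeeds, else the TypeError message."""
    try:
        cls()
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_instantiate(IncompleteIndex))  # reports the abstract-method TypeError
    print(try_instantiate(StubbedIndex))     # instantiation succeeds
```

A stub that raises NotImplementedError only unblocks construction. Any code path that really needs get_profile will still fail, just later and with a clearer message.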

As a temporary workaround, I added a mock implementation of the "get_profile" method to ProfileIndexDuckDB. After rerunning

python3 dindex_builder.py --profile_data_path ddprofiler/output_profiles_json

I got a different error:

DIndex Builder
Building DIndex. profile_data_path: ddprofiler/output_profiles_json
Initializing profile index: duckdb...
Initializing content similarity index: simpleminhash...
Initializing FTS index: duckdb...
error when removing an existing fts index: Catalog Error: a FTS index does not exist on table 'main.fts_data'. Create one with 'PRAGMA create_fts_index()'.
Initializing Graph index: kuzu...
An error has occurred when reading the schema
Traceback (most recent call last):
  File "dindex_builder.py", line 104, in <module>
    dindex = build_dindex(args.profile_data_path, cnf, force=args.force)
  File "dindex_builder.py", line 17, in build_dindex
    dindex = DiscoveryIndex(config, force=force)
  File "/Users/yuegong/Documents/ver/dindex_store/discovery_index.py", line 61, in __init__
    self.__graph_index = DiscoveryIndex.graph_index_mapping[graph_index](config, load=load, force=force)
  File "/Users/yuegong/Documents/ver/dindex_store/graph_index_kuzu.py", line 42, in __init__
    self.conn.execute(statement)
  File "/Users/yuegong/Documents/ver/venv/lib/python3.8/site-packages/kuzu/connection.py", line 73, in execute
    self._connection.execute(
RuntimeError: Parser exception: mismatched input 'Column' expecting {HexLetter, UnescapedSymbolicName, EscapedSymbolicName} (line: 1, offset: 18)
"CREATE NODE TABLE Column(id INT64, PRIMARY KEY (id))"
                   ^^^^^^
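
The parser message points at offset 18, which is where the table name Column begins, and it lists EscapedSymbolicName among the tokens it would accept. That suggests the installed kuzu version may treat Column as a reserved word. One untested possibility is to escape the name with backticks, as Cypher-style grammars allow. The helper below is a hypothetical sketch that only builds the statement string. It is not part of ver or kuzu, and whether the escaped name is accepted by this kuzu version is an assumption.

```python
def create_node_table_statement(table_name, escape=True):
    """Build a CREATE NODE TABLE statement for a single INT64 id column.

    Hypothetical helper: with escape=True the table name is wrapped in
    backticks so that a reserved word such as Column is parsed as an
    identifier (assumption, not verified against kuzu).
    """
    if "`" in table_name:
        raise ValueError("table name must not contain a backtick")
    name = f"`{table_name}`" if escape else table_name
    return f"CREATE NODE TABLE {name}(id INT64, PRIMARY KEY (id))"


if __name__ == "__main__":
    # The unescaped form matches the statement that failed in the traceback.
    print(create_node_table_statement("Column", escape=False))
    print(create_node_table_statement("Column"))
```

If escaping works, every later query that refers to the table would need the same escaped name, so renaming the table to a non-reserved identifier may be the simpler fix.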