GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0

AttributeError: 'NoneType' object has no attribute 'split' #987

Open florisvink opened 1 year ago

florisvink commented 1 year ago

For some tables I got the following error: AttributeError: 'NoneType' object has no attribute 'split'.

What could that be?

Stacktrace:

Traceback (most recent call last):
  File "/Users/floris.vink/Documents/professional-services-data-validator/main.py", line 6, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/__main__.py", line 582, in main
    validate(args)
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/__main__.py", line 560, in validate
    run(args)
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/__main__.py", line 523, in run
    config_managers = build_config_managers_from_args(args)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/__main__.py", line 279, in build_config_managers_from_args
    config_manager = build_config_from_args(args, config_manager)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/__main__.py", line 259, in build_config_from_args
    config_manager.build_column_configs(primary_keys)
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/config_manager.py", line 518, in build_column_configs
    source_table = self.get_source_ibis_calculated_table()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/config_manager.py", line 360, in get_source_ibis_calculated_table
    table = self.get_source_ibis_table()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/config_manager.py", line 342, in get_source_ibis_table
    self._source_ibis_table = clients.get_ibis_table(
                              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/data_validation/clients.py", line 134, in get_ibis_table
    return client.table(table_name, database=database_name, schema=schema_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py", line 519, in table
    sqla_table = self._get_sqla_table(name, database=database, schema=schema)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py", line 422, in _get_sqla_table
    table = sa.Table(
            ^^^^^^^^^
  File "<string>", line 2, in __new__
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/util/deprecations.py", line 375, in warned
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/sql/schema.py", line 618, in __new__
    with util.safe_reraise():
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/sql/schema.py", line 614, in __new__
    table._init(name, metadata, *args, **kw)
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/sql/schema.py", line 689, in _init
    self._autoload(
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/sql/schema.py", line 724, in _autoload
    conn_insp.reflect_table(
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/engine/reflection.py", line 807, in reflect_table
    self._reflect_indexes(
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/engine/reflection.py", line 1035, in _reflect_indexes
    indexes = self.get_indexes(table_name, schema)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/engine/reflection.py", line 605, in get_indexes
    return self.dialect.get_indexes(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in get_indexes
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/engine/reflection.py", line 55, in cache
    ret = fn(self, con, *args, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/floris.vink/Documents/professional-services-data-validator/venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/base.py", line 4447, in get_indexes
    idx_keys = idx_key.split()
               ^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'

nehanene15 commented 1 year ago

This looks like a SQLAlchemy error raised while reflecting the indexes of your Redshift table. It may be related to https://github.com/sqlalchemy/sqlalchemy/issues/7730, where unique constraints cause issues.

It could be worth trying the redshift dialect here instead of the 'postgresql' we use today: https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/develop/third_party/ibis/ibis_redshift/__init__.py#L50

If that works, we can look into updating it in our develop branch.

piyushsarraf commented 1 year ago

@florisvink

Is the issue fixed after changing the dialect?

helensilva14 commented 1 year ago

Hi @florisvink and @piyushsarraf! Any updates on this issue?

piyushsarraf commented 1 year ago

Hi @helensilva14, I think changing the dialect to redshift would solve the issue.

@florisvink can you please confirm?

florisvink commented 1 year ago

So instead of driver=f'postgresql+{driver}', I can try driver=f'redshift+{driver}', after installing https://pypi.org/project/sqlalchemy-redshift/ ?

helensilva14 commented 1 year ago

> So instead of driver=f'postgresql+{driver}', I can try driver=f'redshift+{driver}', after installing https://pypi.org/project/sqlalchemy-redshift/ ?

@florisvink exactly, that's the code change we're asking you to validate! Let us know if it works or if you need any help.
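
For anyone following along, here is a minimal sketch of what that change amounts to at the connection-URL level. It is not the code in third_party/ibis/ibis_redshift/__init__.py: the helper name build_connection_url, its parameters, and the host/database values are hypothetical and only illustrate swapping the 'postgresql' prefix for 'redshift'. It assumes sqlalchemy-redshift is installed so that SQLAlchemy can resolve the 'redshift' dialect name; the sketch itself only builds the URL strings and does not connect to anything.

```python
from urllib.parse import quote_plus


def build_connection_url(user, password, host, port, database,
                         driver="psycopg2", dialect="redshift"):
    """Return a SQLAlchemy-style URL: dialect+driver://user:password@host:port/database.

    Hypothetical helper for illustration only; not part of the validator.
    """
    # Percent-encode credentials so characters such as '@' do not break the URL.
    return (
        f"{dialect}+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder connection details (assumptions, not real endpoints).
old_url = build_connection_url("user", "p@ss", "example-host", 5439, "dev",
                               dialect="postgresql")
new_url = build_connection_url("user", "p@ss", "example-host", 5439, "dev")

print(old_url)
print(new_url)
```

Only the part before "://" differs between the two URLs, so if the 'redshift' dialect avoids the failing index reflection, the rest of the connection setup should not need to change.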