Open mibarannikov opened 1 week ago
@mibarannikov Hello! I'm here to help you with any bugs, questions, or contributions.
The issue you're encountering with the Shillelagh adapter not returning column values greater than 64 in JSON data sources might be related to specific configurations or limitations in the adapter's handling of JSON data.
Column Limit Configuration:
The Shillelagh adapter has a limit configuration that might be affecting the number of columns processed. The get_data
method in the SupersetShillelaghAdapter
class includes a limit configuration that fetches the app_limit
from the application configuration SUPERSET_META_DB_LIMIT
:
def get_data(
self,
bounds: dict[str, Filter],
order: list[tuple[str, RequestedOrder]],
limit: int | None = None,
offset: int | None = None,
**kwargs: Any,
) -> Iterator[Row]:
app_limit: int | None = current_app.config["SUPERSET_META_DB_LIMIT"]
if limit is None:
limit = app_limit
elif app_limit is not None:
limit = min(limit, app_limit)
query = self._build_sql(bounds, order, limit, offset)
with self.engine_context() as engine:
connection = engine.connect()
rows = connection.execute(query)
for i, row in enumerate(rows):
data = dict(zip(self.columns, row))
data["rowid"] = data[self._rowid] if self._rowid else i
yield data
Ensure that the SUPERSET_META_DB_LIMIT
is set to a value that accommodates the number of columns you need.
Column Inspection:
The _set_columns
method in the SupersetShillelaghAdapter
class inspects the table to get its columns during initialization. This method might be missing columns if there is a limit or issue during the inspection process:
def _set_columns(self) -> None:
from superset.models.core import Database
database = (
db.session.query(Database).filter_by(database_name=self.database).first()
)
if database is None:
raise ProgrammingError(f"Database not found: {self.database}")
self._allow_dml = database.allow_dml
table = sql_parse.Table(self.table, self.schema, self.catalog)
security_manager.raise_for_access(database=database, table=table)
self.engine_context = partial(
database.get_sqla_engine,
catalog=self.catalog,
schema=self.schema,
)
metadata = MetaData()
with self.engine_context() as engine:
try:
self._table = Table(
self.table,
metadata,
schema=self.schema,
autoload=True,
autoload_with=engine,
)
except NoSuchTableError as ex:
raise ProgrammingError(f"Table does not exist: {self.table}") from ex
primary_keys = [
column for column in list(self._table.primary_key) if column.primary_key
]
if len(primary_keys) == 1 and primary_keys[0].type.python_type == int:
self._rowid = primary_keys[0].name
self.columns = {
column.name: self.get_field(column.type.python_type)
for column in self._table.c
}
Logging and Debugging:
Enable detailed logging to identify any issues during the data retrieval process. Adjust the logging level to DEBUG
in the superset/config.py
file:
LOG_LEVEL = logging.DEBUG
TIME_ROTATE_LOG_LEVEL = logging.DEBUG
Check for Known Issues: There might be known issues or bug reports related to handling large JSON objects or arrays in the Shillelagh adapter. Ensure you have the latest version of the adapter and check the repository for any related issues.
By addressing these potential causes, you should be able to resolve the issue with column values greater than 64 being null in JSON data sources. If the problem persists, further investigation into the specific configurations and limitations of the Shillelagh adapter may be necessary.
To continue talking to Dosu, mention @dosu.
@betodealmeida is definitely the one to talk to here!
Thanks, updating the library apsw helped.
Bug description
All values are null in json for fields greater than 64. I updated superset 3.1 -> 4.0.2. I updated shillagh to 1.2.27 (apsw 3.42.0.1). I created another json array source. But nothing changed. All json fields greater than 64 are null.
Generation was performed by the activity process. var test = []; for (var l = 0; l < 2; l++) { var o = {};
} execution.setVariable("test", JSON.stringify(test));
How to reproduce the bug
Screenshots/recordings
Superset version
4.0.2
Python version
3.9
Node version
16
Browser
Chrome
Additional context
logs from superset 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,485:DEBUG:shillelagh.adapters.api.generic_json:{'col_0': 0, 'col_1': 1, 'col_2': 2, 'col_3': 3, 'col_4': 4, 'col_5': 5, 'col_6': 6, 'col_7': 7, 'col_8': 8, 'col_9': 9, 'col_10': 10, 'col_11': 11, 'col_12': 12, 'col_13': 13, 'col_14': 14, 'col_15': 15, 'col_16': 16, 'col_17': 17, 'col_18': 18, 'col_19': 19, 'col_20': 20, 'col_21': 21, 'col_22': 22, 'col_23': 23, 'col_24': 24, 'col_25': 25, 'col_26': 26, 'col_27': 27, 'col_28': 28, 'col_29': 29, 'col_30': 30, 'col_31': 31, 'col_32': 32, 'col_33': 33, 'col_34': 34, 'col_35': 35, 'col_36': 36, 'col_37': 37, 'col_38': 38, 'col_39': 39, 'col_40': 40, 'col_41': 41, 'col_42': 42, 'col_43': 43, 'col_44': 44, 'col_45': 45, 'col_46': 46, 'col_47': 47, 'col_48': 48, 'col_49': 49, 'col_50': 50, 'col_51': 51, 'col_52': 52, 'col_53': 53, 'col_54': 54, 'col_55': 55, 'col_56': 56, 'col_57': 57, 'col_58': 58, 'col_59': 59, 'col_60': 60, 'col_61': 61, 'col_62': 62, 'col_63': 63, 'col_64': 64, 'col_65': 65, 'col_66': 66, 'col_67': 67, 'col_68': 68, 'col_69': 69, 'col_70': 70, 'col_71': 71, 'col_72': 72, 'col_73': 73, 'col_74': 74, 'col_75': 75, 'col_76': 76, 'col_77': 77, 'col_78': 78, 'col_79': 79, 'col_80': 80, 'col_81': 81, 'col_82': 82, 'col_83': 83, 'col_84': 84, 'col_85': 85, 'col_86': 86, 'col_87': 87, 'col_88': 88, 'col_89': 89, 'col_90': 90, 'col_91': 91, 'col_92': 92, 'col_93': 93, 'col_94': 94, 'col_95': 95, 'col_96': 96, 'col_97': 97, 'col_98': 98, 'col_99': 99, 'rowid': 0} 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,485:DEBUG:shillelagh.adapters.api.generic_json:{'col_0': 100, 'col_1': 101, 'col_2': 102, 'col_3': 103, 'col_4': 104, 'col_5': 105, 'col_6': 106, 'col_7': 107, 'col_8': 108, 'col_9': 109, 'col_10': 110, 'col_11': 111, 'col_12': 112, 'col_13': 113, 'col_14': 114, 'col_15': 115, 'col_16': 116, 'col_17': 117, 'col_18': 118, 'col_19': 119, 'col_20': 120, 'col_21': 121, 'col_22': 122, 'col_23': 123, 'col_24': 124, 'col_25': 125, 'col_26': 126, 'col_27': 127, 'col_28': 128, 'col_29': 129, 'col_30': 130, 'col_31': 131, 'col_32': 132, 'col_33': 133, 'col_34': 134, 'col_35': 135, 'col_36': 136, 'col_37': 137, 'col_38': 138, 'col_39': 139, 'col_40': 140, 'col_41': 141, 'col_42': 142, 'col_43': 143, 'col_44': 144, 'col_45': 145, 'col_46': 146, 'col_47': 147, 'col_48': 148, 'col_49': 149, 'col_50': 150, 'col_51': 151, 'col_52': 152, 'col_53': 153, 'col_54': 154, 'col_55': 155, 'col_56': 156, 'col_57': 157, 'col_58': 158, 'col_59': 159, 'col_60': 160, 'col_61': 161, 'col_62': 162, 'col_63': 163, 'col_64': 164, 'col_65': 165, 'col_66': 166, 'col_67': 167, 'col_68': 168, 'col_69': 169, 'col_70': 170, 'col_71': 171, 'col_72': 172, 'col_73': 173, 'col_74': 174, 'col_75': 175, 'col_76': 176, 'col_77': 177, 'col_78': 178, 'col_79': 179, 'col_80': 180, 'col_81': 181, 'col_82': 182, 'col_83': 183, 'col_84': 184, 'col_85': 185, 'col_86': 186, 'col_87': 187, 'col_88': 188, 'col_89': 189, 'col_90': 190, 'col_91': 191, 'col_92': 192, 'col_93': 193, 'col_94': 194, 'col_95': 195, 'col_96': 196, 'col_97': 197, 'col_98': 198, 'col_99': 199, 'rowid': 1} 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,490:DEBUG:requests_cache.policy.actions:Cache directives from request headers: CacheDirectives() 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,490:DEBUG:requests_cache.policy.actions:Pre-read cache checks: Passed 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,490:DEBUG:requests_cache.policy.actions:Post-read cache actions: CacheActions(expire_after=180.0) 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,490:DEBUG:shillelagh.adapters.api.generic_json:{'col_0': 0, 'col_1': 1, 'col_2': 2, 'col_3': 3, 'col_4': 4, 'col_5': 5, 'col_6': 6, 'col_7': 7, 'col_8': 8, 'col_9': 9, 'col_10': 10, 'col_11': 11, 'col_12': 12, 'col_13': 13, 'col_14': 14, 'col_15': 15, 'col_16': 16, 'col_17': 17, 'col_18': 18, 'col_19': 19, 'col_20': 20, 'col_21': 21, 'col_22': 22, 'col_23': 23, 'col_24': 24, 'col_25': 25, 'col_26': 26, 'col_27': 27, 'col_28': 28, 'col_29': 29, 'col_30': 30, 'col_31': 31, 'col_32': 32, 'col_33': 33, 'col_34': 34, 'col_35': 35, 'col_36': 36, 'col_37': 37, 'col_38': 38, 'col_39': 39, 'col_40': 40, 'col_41': 41, 'col_42': 42, 'col_43': 43, 'col_44': 44, 'col_45': 45, 'col_46': 46, 'col_47': 47, 'col_48': 48, 'col_49': 49, 'col_50': 50, 'col_51': 51, 'col_52': 52, 'col_53': 53, 'col_54': 54, 'col_55': 55, 'col_56': 56, 'col_57': 57, 'col_58': 58, 'col_59': 59, 'col_60': 60, 'col_61': 61, 'col_62': 62, 'col_63': 63, 'rowid': 0} 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,491:DEBUG:root:Query 260: Handling cursor 2024-08-29T20:51:33+03:00 2024-08-29 17:51:33,495:DEBUG:shillelagh.adapters.api.generic_json:{'col_0': 100, 'col_1': 101, 'col_2': 102, 'col_3': 103, 'col_4': 104, 'col_5': 105, 'col_6': 106, 'col_7': 107, 'col_8': 108, 'col_9': 109, 'col_10': 110, 'col_11': 111, 'col_12': 112, 'col_13': 113, 'col_14': 114, 'col_15': 115, 'col_16': 116, 'col_17': 117, 'col_18': 118, 'col_19': 119, 'col_20': 120, 'col_21': 121, 'col_22': 122, 'col_23': 123, 'col_24': 124, 'col_25': 125, 'col_26': 126, 'col_27': 127, 'col_28': 128, 'col_29': 129, 'col_30': 130, 'col_31': 131, 'col_32': 132, 'col_33': 133, 'col_34': 134, 'col_35': 135, 'col_36': 136, 'col_37': 137, 'col_38': 138, 'col_39': 139, 'col_40': 140, 'col_41': 141, 'col_42': 142, 'col_43': 143, 'col_44': 144, 'col_45': 145, 'col_46': 146, 'col_47': 147, 'col_48': 148, 'col_49': 149, 'col_50': 150, 'col_51': 151, 'col_52': 152, 'col_53': 153, 'col_54': 154, 'col_55': 155, 'col_56': 156, 'col_57': 157, 'col_58': 158, 'col_59': 159, 'col_60': 160, 'col_61': 161, 'col_62': 162, 'col_63': 163, 'rowid': 1} 2024-08-29T20:51:33+03:00 Query 260: Storing results in results backend, key: 8c5eba8a-427b-4cf3-b9a4-b735de18f0aa
Checklist