Closed leehart closed 5 years ago
This is despite imposing a limit on the query rows:
let queryAPIargs = {
database: this.config.dataset,
table: this.config.tablesById[table].id,
columns,
query: this.getDefinedQuery(query, table),
orderBy: order,
groupBy,
start: 0,
stop: 1000,
transpose: false
};
...which itself causes problems related to incomplete data (#319).
I'm thinking of removing this limit (to solve #319, etc.) and then trying to find a different way to prevent this memory issue, such as only allowing options that produce fewer than (7096 samples x 29 countries = 205,784 data points).
In theory, JavaScript arrays and objects should be able to remember 4,294,967,295 entries, so I suspect there are some other hogs in the forest.
I'm thinking we could handle this in the same way that we currently handle the download limit breach, in DataTableWithActions, which uses a utility function DataDownloader, which calls an onLimitBreach
function, and displays a message advising the user (to use a filter, etc.).
So we can be consistent with that approach/precedent.
(DataDownloader currently has a constant set to MAX_DOWNLOAD_DATA_POINTS = 100000
, which isn't such a bad place to start, considering I know 205,784 data points causes problems in the pivot.)
Limitations/simplicity of the current API.rowsCount
means we can't (yet) get a proper rowsCount
for group by
queries that include a count(*)
, because this sort of query:
SELECT count(*) AS "count","sample_id","country_id" FROM "pf_samples" GROUP BY "sample_id","country_id" ORDER BY "sample_id" asc,"country_id" asc
simply becomes this sort of rowsCount
query
SELECT count(*) AS "TotalRecordCount" FROM "pf_samples" GROUP BY "sample_id","country_id"
which causes the API to return the first grouped value, which isn't a count of rows.
Or if we tweak API.rowsCount
to strip out group by
, then we get the just as useless
SELECT count(*) AS "TotalRecordCount" FROM "pf_samples"
For monetdb, it looks like a common way to get a row count of group by
queries would be a subquery/sub-select, but I don't think that's straightforward in Panoptes (yet). (I think it was built on a very simplistic/flat data and query model/assumption.)
On a side note, it turns out the the unresponsiveness isn't due to the query itself and the volume of data per se, but only the process of displaying/rendering that data as a mui Table
.
So another approach is to run the normal group by
query, and then count the data points in JavaScript, then decide whether to show the results (if it's safe to do so) or show a message instead.
Verified on staging
Samples table Pivot table Column: Sample ID Row: Sample ID See that the browser does not respond for about 1 minute. When the table eventually appears, see that trying to scroll the table, or choose other options, etc. causes more unresponsiveness. Some times the browser's automatic debugger will be triggered:
Paused before potential out-of-memory crash
.