CellProfiler / CellProfiler-Analyst

Open-source software for exploring and analyzing large, high-dimensional image-derived data.
http://cellprofileranalyst.org

Make check_tables work with complex databases #304

Closed DavidStirling closed 3 years ago

DavidStirling commented 3 years ago

We had a problem where check_tables failed on SQLite files with more than 500 measurement columns. SQLite appears to have a limit of 1000 comparators per operation, and since we check every column against both NULL and the empty string (two comparisons per column), any table with more than 500 columns exceeds that limit.

To resolve this for now, I've made table checking break the SQL commands up into chunks of 400 columns. We first create the checked table using an initial set of rules, then go through and delete the rows which fail each additional batch.

The main downside to this approach is that deleting rows seems to lock the database, so we'll have to write the checked tables as regular tables instead of keeping them as temporary tables.
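
For illustration, the chunked approach could look roughly like the sketch below. This is not the code from this PR: the function names (`make_checked_table`, `good_row_clause`, `bad_row_clause`), the table names and the exact SQL are assumptions made up for the example, using only the standard `sqlite3` module. The first batch of columns builds the checked table as a regular table, and each later batch deletes the rows that fail its checks.

```python
import sqlite3

CHUNK_SIZE = 400  # columns per SQL statement, as described above


def chunks(items, size):
    """Yield successive slices of `items` containing at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def good_row_clause(columns):
    """WHERE clause matching rows where every column is non-NULL and non-empty."""
    return " AND ".join(
        f'"{c}" IS NOT NULL AND "{c}" != \'\'' for c in columns
    )


def bad_row_clause(columns):
    """WHERE clause matching rows where any column is NULL or empty."""
    return " OR ".join(f'"{c}" IS NULL OR "{c}" = \'\'' for c in columns)


def make_checked_table(conn, source, dest, columns, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: build `dest` from `source`, keeping only clean rows.

    Each statement references at most `chunk_size` columns, so the number
    of comparisons per statement stays at 2 * chunk_size.
    """
    batches = list(chunks(columns, chunk_size))
    if not batches:
        # Nothing to check, so copy the table unchanged.
        conn.execute(f'CREATE TABLE "{dest}" AS SELECT * FROM "{source}"')
        conn.commit()
        return
    # The first batch creates a regular (non-temporary) table.
    conn.execute(
        f'CREATE TABLE "{dest}" AS SELECT * FROM "{source}" '
        f"WHERE {good_row_clause(batches[0])}"
    )
    # The remaining batches remove rows that fail their checks.
    for batch in batches[1:]:
        conn.execute(f'DELETE FROM "{dest}" WHERE {bad_row_clause(batch)}')
    conn.commit()
```

With 400 columns per chunk, each statement performs 800 comparisons, which leaves some headroom below the apparent limit of 1000.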