OHDSI / WebAPI

OHDSI WebAPI contains all OHDSI services that can be called from OHDSI applications
Apache License 2.0
126 stars 156 forks source link

Snowflake SQL IN where clause contains maximum number of expressions allowed #2284

Open davidhcar opened 1 year ago

davidhcar commented 1 year ago

Expected behavior

Return result set for the following SQL query,

select concept_id, record_count, descendant_record_count, person_count, descendant_person_count
from _results.achilles_result_concept_count where concept_id IN(around 30000 of concept_ids)

Actual behavior

nested exception is net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error: error line 3 at position 20
maximum number of expressions in a list exceeded, expected at most 16,384, got 30,000

Total number of records under tab 'Included Source Codes' has 69,523 records

Steps to reproduce behavior

Under Concept Sets --> tab 'Included Source Codes' when this list exceeds 16,384 the above SQL exception from Snowflake thrown.

Note: The tab Included Concept Sets record is 16,384. It looks like Atlas restricting this list to this max but missed on the 'Included Source Codes' tab ? It looks like it but not sure, wanted to include this observation.

The recommendation from snowflake is, rewrite this query with JOIN statement, https://community.snowflake.com/s/article/maxi-expressions-exceeded

chrisknoll commented 1 year ago

can't rewrite as a join because we're getting a list of conceptIDs to lookup, so one way or another we're going to be needing an IN, we just need to partition.

I thought we already did partition out the set of conceptIDs we fetch from the record count....but I could be remembering something else. Therefore, we need a dialect-specific maximum that we can partition the IN clause and perform multiple queries for each partition.

davidhcar commented 1 year ago

Thank you @chrisknoll. Looking for ward to this partition work..