Open heschmidt04 opened 7 months ago
I encountered the same issue when exporting data to Ares. After some investigation, I discovered that using the TRY_CAST function in Snowflake was an effective solution. By implementing TRY_CAST across all relevant scripts in the Achilles::exportToAres function, I was able to resolve the issue and ensure smooth data exports to Ares.
However, it's important to note that while this solution works well for Snowflake, it may not be suitable for other databases that do not support the TRY_CAST function. This can lead to compatibility issues across different database systems. Therefore, to make the scripts work universally across all databases, a more complex fix might be necessary.
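One possible direction for such a portable fix is to guard the cast with a plain CASE expression instead of relying on TRY_CAST. The following is only a sketch of one branch of the export query. It assumes that the failing values are empty strings, as the error message reported in this issue suggests, and it would not protect against other non-numeric text.

```sql
-- Hypothetical portable variant of a single branch of the export query.
-- Assumption: the values that break CAST are empty strings ('').
-- Any other non-numeric text in stratum_1 would still raise an error.
select 'Visit occurrence' as table_name,
       case
         when stratum_1 is not null and stratum_1 <> ''
           then cast(stratum_1 as bigint)
         -- no ELSE branch: unconvertible rows yield NULL, like TRY_CAST
       end as stratum_1,
       count_value
from achilles_results
where analysis_id = 220
group by analysis_id, stratum_1, count_value
```

Most engines only evaluate the CAST when the WHEN condition is true, but that behavior should be checked on each target platform before relying on it.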
Example of the modification:
Original script:
CREATE TABLE AO_export_error AS
select t1.table_name as SERIES_NAME
, t1.stratum_1 as X_CALENDAR_MONTH
, round(1.0*t1.count_value/denom.count_value,5) as Y_RECORD_COUNT
from
(
select 'Visit occurrence' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 220 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Condition occurrence' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 420 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Death' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 502 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Procedure occurrence' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 620 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Drug exposure' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 720 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Observation' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 820 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Drug era' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 920 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Condition era' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 1020 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Observation period' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 111 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Measurement' as table_name, CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 1820 GROUP BY analysis_id, stratum_1, count_value
) t1
inner join
(select CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 117 GROUP BY analysis_id, stratum_1, count_value) denom
on t1.stratum_1 = denom.stratum_1
ORDER BY SERIES_NAME, t1.stratum_1
Script with TRY_CAST:
CREATE TABLE AO_export_error AS
select t1.table_name as SERIES_NAME
, t1.stratum_1 as X_CALENDAR_MONTH
, round(1.0*t1.count_value/denom.count_value,5) as Y_RECORD_COUNT
from
(
select 'Visit occurrence' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 220 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Condition occurrence' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 420 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Death' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 502 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Procedure occurrence' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 620 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Drug exposure' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 720 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Observation' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 820 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Drug era' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 920 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Condition era' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 1020 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Observation period' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 111 GROUP BY analysis_id, stratum_1, count_value
union all
select 'Measurement' as table_name, **TRY_CAST**(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 1820 GROUP BY analysis_id, stratum_1, count_value
) t1
inner join
(select CAST(stratum_1 as bigint) stratum_1, count_value from achilles_results where analysis_id = 117 GROUP BY analysis_id, stratum_1, count_value) denom
on t1.stratum_1 = denom.stratum_1
ORDER BY SERIES_NAME, t1.stratum_1
Note: TRY_CAST works in this context because, as mentioned by @heschmidt04, Snowflake attempts to auto-detect data types in tables. TRY_CAST returns NULL when a conversion fails instead of raising an error, so the script keeps executing. CAST, by contrast, stops the statement with an error such as "Numeric value '' is not recognized".
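The difference can be checked directly in a Snowflake worksheet. The second query is a hypothetical diagnostic, not part of Achilles: it lists the stratum_1 values that cannot be converted for the analysis ids used in the export query above, assuming achilles_results is reachable without a schema prefix.

```sql
-- Snowflake: TRY_CAST returns NULL where CAST would raise an error.
select try_cast('' as bigint)       as empty_string,  -- NULL
       try_cast('202001' as bigint) as valid_month;   -- 202001

-- select cast('' as bigint);  -- fails: Numeric value '' is not recognized

-- Hypothetical diagnostic: find the values that TRY_CAST turns into NULL.
select analysis_id, stratum_1, count(*) as row_count
from achilles_results
where analysis_id in (111, 117, 220, 420, 502, 620, 720, 820, 920, 1020, 1820)
  and try_cast(stratum_1 as bigint) is null
group by analysis_id, stratum_1
order by analysis_id;
```

Rows whose stratum_1 becomes NULL never satisfy the join condition t1.stratum_1 = denom.stratum_1, so they drop out of the export silently rather than causing an error.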
Describe the bug
The Ares exporter fails on a numeric type because the in-memory temp table runs into a data type conversion issue caused by auto-sensing. A search of the Ares issues on GitHub did not turn up an earlier report of this happening during export.
To Reproduce Steps to reproduce the behavior:
Expected behavior
The Ares exporter writes files to the output folder.
Is there a possible workaround for this part?
Screenshots
DBMS: Snowflake
Error: net.snowflake.client.jdbc.SnowflakeSQLException: Numeric value '' is not recognized
Stackoverflow: https://stackoverflow.com/questions/70176093/numeric-value-is-not-recognized
The workaround works because the data type is auto-sensed.
-- AO_EXPORT_DENOM
-- AO_EXPORT_T1
Desktop (please complete the following information):
R version: R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20
Attached base packages:
Other attached packages:
Additional context
These are the files the process created. There is no error.txt file in the errors directory.
Mar 15 17:06 dq-result_camel.json
Mar 15 17:06 log_DqDashboard_Snowflake-MGB-OMOP.txt -- binary file??? Not sure why.
Mar 15 17:06 dq-result.json
Mar 15 17:06 datadensity-total.csv
Mar 15 17:06 records-by-domain.csv