EthanSteinberg closed this 2 weeks ago
I just ran into this. For me the offending operation was here:
    backup_value = (
        pl.when((source_concept_id == 0) & (concept_id_value != 0))
        .then(
            # Source concept 0 indicates we need a backup value since it's not captured by the source
            "SOURCE_CODE/"
            + pl.col(concept_id_field.replace("_concept_id", "_source_value"))
        )
        .otherwise(
            # Should be captured by the source concept id, so just map the value to a string.
            concept_id_value.map_dict(concept_name_map)
        )
    )
By removing the otherwise I could sink to parquet; likewise if I changed the otherwise to contain only pl.lit(1).
Collecting and then writing to parquet works fine.
Fixed
I have not been able to reproduce, but I have received two reports that the streaming engine in Polars is failing for parquet files when using the OMOP ETL.
The precise error message is the following:
This needs to be fixed, if only by switching to the workaround collect().write_parquet().