National-COVID-Cohort-Collaborative / Phenotype_Data_Acquisition

The repository for code and documentation produced by the N3C Phenotype and Data Acquisition workstream

R Exporter failing during Observation table write #228

Closed nzurawski closed 1 year ago

nzurawski commented 1 year ago

Lately I cannot get the R exporter to work with our OMOP CDM. It seems to be failing while writing the Observation.CSV file. I am getting an error file, but I'm not sure how to interpret it; I've attached it here. No error shows up on the console itself: the R session just goes from running normally to suddenly crashing. I am using useAndromeda = TRUE because without it I was running into memory issues. Any ideas how I might troubleshoot this? Attachment: hs_err_pid35533.log

empfff commented 1 year ago

I cannot say for sure, but despite your using Andromeda, you may still be running into a memory issue. Is your Observation table your biggest table? I have gotten reports from a couple other sites that the R Exporter is just no longer able to handle the extracts this late in the pandemic, when there are many more patients (and data) than at the beginning.

If you suspect it might be a memory issue: is there any possibility you could try the Python exporter? I hate to ask you to switch your method this late in the game, but I have not had reports of memory issues there, and it may be easier for us to troubleshoot. Let us know, we'll figure something out!
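To illustrate why a streaming export tends to be gentler on memory than loading a whole table at once, here is a minimal sketch. It is not the N3C exporter's code: the function name `export_table_in_chunks`, the in-memory SQLite demo table, and the chunk size are all assumptions made for illustration. The idea is only that rows are fetched and written a batch at a time, so a very large Observation table never has to fit in memory.

```python
import csv
import sqlite3


def export_table_in_chunks(conn, table, out_path, chunk_size=10_000):
    """Stream one table to CSV, holding at most chunk_size rows in memory.

    Hypothetical helper for illustration; not part of the N3C exporters.
    """
    cur = conn.cursor()
    # The table name is interpolated directly, so only pass trusted names.
    cur.execute(f"SELECT * FROM {table}")
    header = [col[0] for col in cur.description]
    rows_written = 0
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        while True:
            chunk = cur.fetchmany(chunk_size)  # DB-API call: next batch of rows
            if not chunk:
                break
            writer.writerows(chunk)
            rows_written += len(chunk)
    return rows_written


def demo(out_path="observation_demo.csv"):
    """Build a tiny stand-in table in SQLite and export it in batches of 10."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation (observation_id INTEGER, person_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?)",
        [(i, i % 7) for i in range(25)],
    )
    rows = export_table_in_chunks(conn, "observation", out_path, chunk_size=10)
    conn.close()
    return rows
```

Calling `demo()` writes a 25-row CSV plus a header line. Against a real database the same loop would work with any DB-API connection, though whether the driver itself buffers the full result set is driver-specific.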

nzurawski commented 1 year ago

Hi there. Thanks for the response. I don't believe we have anything set up here already to run Python scripts. I'm trying to do it from the RStudio IDE using reticulate, but I'm not very familiar with Python. I edited the example ini file with the right parameters for our site, renamed it to config.ini, and ran the provided command (edited to make it site specific):

python db_exp.py --config config.ini --database mssql --phenotype N3C_phenotype_omop_mssql.sql --extract N3C_extract_omop_mssql.sql --zip

But the R console just says: SyntaxError: invalid syntax (<string>, line 1)

Am I doing something silly here? (Screenshot attached: pythonerror)
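The thread does not record what eventually fixed this, so the following is only one plausible reading. The line starting with `python db_exp.py ...` is a shell command, not Python code, and a Python interpreter (including one reached through reticulate) rejects it with a SyntaxError. A minimal sketch of launching the same command as a separate process is below. The helper names `build_exporter_command`, `run_exporter`, and `is_python_code` are hypothetical, and `cwd` is assumed to be the folder containing `db_exp.py` and `config.ini`.

```python
import subprocess
import sys


def build_exporter_command(
    config="config.ini",
    database="mssql",
    phenotype="N3C_phenotype_omop_mssql.sql",
    extract="N3C_extract_omop_mssql.sql",
):
    """Return the exporter invocation from the thread as an argument list."""
    return [
        sys.executable, "db_exp.py",  # same interpreter that runs this script
        "--config", config,
        "--database", database,
        "--phenotype", phenotype,
        "--extract", extract,
        "--zip",
    ]


def run_exporter(cwd="."):
    """Run the exporter as a child process; raises if it exits non-zero.

    Assumption: cwd is the directory that holds db_exp.py and config.ini.
    """
    return subprocess.run(build_exporter_command(), cwd=cwd, check=True)


def is_python_code(text):
    """Report whether text parses as Python source."""
    try:
        compile(text, "<string>", "exec")
        return True
    except SyntaxError:
        return False
```

Here `is_python_code("python db_exp.py --config config.ini")` returns `False`, which matches the kind of error reported above. Running the command from a system terminal (or RStudio's Terminal tab rather than its Console) avoids the problem without any wrapper.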

nzurawski commented 1 year ago

Just updating to say I did get the exporter running through Python, so no need to respond. I'll close the issue if it works; otherwise I'll update. Thanks for the Python suggestion and for letting us know we weren't alone in running into memory issues with the R exporter.

empfff commented 1 year ago

Awesome! Sorry I didn't get back to you on your previous question in time, but it seems like you worked it out. Glad to hear it's working. This is a good thing for us to keep in mind for the future: the R Exporter may just not be built for ginormous exports. Thanks!