cumc-dbmi / cehrbert

CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks
MIT License

Syntax error when creating training data #44

Closed by schuemie 1 year ago

schuemie commented 1 year ago

After the event-visit linking PR (https://github.com/cumc-dbmi/cehr-bert/pull/26) was accepted, I'm now seeing this syntax error. I'm running on Windows:

spark-submit --driver-memory 64g --executor-cores 8 --num-executors 3 --executor-memory 8g spark_apps/generate_training_data.py -i d:/gpm_ccae/ -o d:/gpm_ccae/cehr-bert -tc condition_occurrence procedure_occurrence drug_exposure -d 1985-01-01 --is_new_patient_representation -iv
23/06/05 00:31:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "C:/Users/admin_mschuemi/Documents/git/cehr-bert/spark_apps/generate_training_data.py", line 7, in <module>
    from utils.spark_utils import *
  File "C:\Users\admin_mschuemi\Documents\git\cehr-bert\utils\spark_utils.py", line 785
    patient_event = patient_event.drop("_max")
                ^
SyntaxError: invalid syntax
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Any thoughts?

ChaoPang commented 1 year ago

@schuemie Do you still see this error after you pull from the master branch? A PR (#40) was merged recently to fix the syntax issue related to the event-visit linking PR (#26).

schuemie commented 1 year ago

Ah, sorry about that. Yes, the problem is solved now.
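
For anyone hitting a similar SyntaxError on import, here is a minimal sketch of a pre-flight check that can be run with plain Python before launching spark-submit. It only asks the interpreter to parse a file and reports the first syntax error it finds. The helper name `check_syntax` is hypothetical and not part of cehr-bert; the path in the usage comment is taken from the traceback above. Keep in mind that the reported line is where the parser gave up, so the actual mistake (for example an unclosed bracket) may sit on an earlier line.

```python
from typing import Optional


def check_syntax(path: str) -> Optional[str]:
    """Return None if the file parses, else a short 'file:line: message' string.

    Hypothetical helper for illustration; it is not part of cehr-bert.
    """
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    try:
        # compile() only parses and byte-compiles the source; nothing is executed
        # and no .pyc file is written.
        compile(source, path, "exec")
    except SyntaxError as err:
        return f"{err.filename}:{err.lineno}: {err.msg}"
    return None


# Usage (assumed path, relative to the repository root):
#   problem = check_syntax("utils/spark_utils.py")
#   print(problem or "no syntax errors found")
```

Run this with the same Python interpreter that Spark uses, since whether a given construct parses can depend on the Python version.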