medizininformatik-initiative / INTERPOLAR

CDS Tool Chain Repository
https://medizininformatik-initiative.github.io/INTERPOLAR/
5 stars 0 forks source link

Fehler in StartDB2Frontend.R wegen leerer fall_fe Tabelle #416

Open medicreitzner opened 1 week ago

medicreitzner commented 1 week ago

Hallo zusammen,

ich habe erhalte folgenden Fehler beim Ausführen des Frontend Scriptes R-db2frontend/StartDB2Frontend.R:

Try to connect with: 
   dbname=cds_hub_db
   host=cds_hub
   port=5432
   user=db2frontend_user
   password=***
   schema=db2frontend_out
Error: Failed to prepare query : ERROR:  column "fall_typ_id" does not exist
LINE 1: ...nstrument, redcap_repeat_instance, patient_id_fk, fall_typ_i...
                                                             ^
HINT:  Perhaps you meant to reference the column "v_fall.fall_fe_id" or the column "v_fall.fall_pat_id".
Execution halted

Wir verwenden den Server https://mii-agiop-3p.life.uni-leipzig.de zum testen.

Unsere cds2db_config.toml sieht folgendermaßen aus:

#############################################
# Project Definition for cds2db ETL process #
#############################################

[general]

###
# VERBOSE LEVELS
###
# VL_00_NOTHING             = 0
# VL_10_MAIN_INFO           = 1
# VL_20_OUTER_SCRIPTS       = 2
# VL_30_INNER_SCRIPTS       = 3
# VL_40_INNER_SCRIPTS_INFOS = 4
# VL_50_TABLES              = 5
# VL_60_ALL_TABLES          = 6
# VL_70_DOWNLOAD            = 7
# VL_80_COI_PROCESSING      = 8
# VL_90_FHIR_RESPONSE       = 9
VERBOSE               = 10 # Retrieval / analysis verbose level

# Set true if every sourcing of the 00_Main file should create result directories in the
# otuputGlobal and otuputLocal directores with a timestamp suffix. If false then every run will
# write in the same subdirectory without a timestamp.
# The default behaviour should write in subdirectories with only the project name (false) or in or
# in a subdirectorywith the project name and the suffix '_timestamp'.
USE_TIMESTAMP_AS_RESULT_DIR_SUFFIX = false

# Maximum of saved cds2db result folders in outputLocal and outputGlobal
MAX_DIR_COUNT = 5

[retrieve]

# Number of minutes to wait between inserting the data into the RAW tables and starting the
# typing process.
# This time must be greater than the time for the cron jobs of the copy functions in the database
# (defined in database init script 30, at the bottom).
# Default is 1 minute.
DELAY_MINUTES_BETWEEN_RAW_INSERT_AND_START_TYPING = 1

[retrieve.fhir_server]

FHIR_SERVER_ENDPOINT = "https://mii-agiop-3p.life.uni-leipzig.de/blaze"
# FHIR_SERVER_ENDPOINT = "http://127.0.0.1:8080/blaze"
FHIR_SERVER_USER = ""
FHIR_SERVER_PASS = ""

FHIR_TOKEN                  = ""
FHIR_TOKEN_REFRESH_URL      = ""
FHIR_TOKEN_REFRESH_USER     = ""
FHIR_TOKEN_REFRESH_PASSWORD = ""

MAX_CHARACTER_LENGTH_FOR_GET_REQUESTS         = 2083 # max character length for GET-Requests [ehemals MAX_LEN]
MAX_CHARACTER_LENGTH_FOR_GET_REQUESTS_RESERVE = 300  # reserve 300 characters for further parameters, whose length cannot estimated [ehemals RES_LEN]

MIN_PATIENT_AGE                    = 18           # defined Age threshold in POLAR basic population
MIN_BIRTHDAY                       = "1910-01-01" # lowest possible birthday
DEFAULT_EMPTY_BIRTHDAY_PATTERN     = ""           # pattern to identify the default birth date if the real date is missing

MAX_ENCOUNTER_BUNDLES              = inf # maximum number of Encounter bundles that will be loaded from the server ('inf' = all)
COUNT_PER_BUNDLE                   = 100 # as default or use server default ? The number of results that are mapped per FHIR request on a response page [ehemals COUNT]
SORT                               = ""  # some Fhir server require a sort parameter per default
BUNDLES_AT_ONCE                    = 10  # bundles per request - number of bundles which are processed in a FHIR request
IDS_AT_ONCE                        = 100 # patient / medication ids combined in one request (GET or POST) - specifies the number of ID's, which are summarized in a request (after download of Encounters)

# polar_download_by_ids_and_crack_parallel() and polar_download_and_crack_parallel()
MAX_CORES                          = 0 # paralellization, number of cores used, 0: unlimited
FORCE_DOWNLOAD_AND_CRACK_PARALLEL  = false # if TRUE: force download and crack parallel with basic method, avoid polar_download_by_ids_and_crack_parallel()
NEXT_LINK_ENCODE                   = false # encode special chars in next_link (see polar_download_and_crack_parallel())
DELAY_REQ                          = 2 # if request fails, wait X seconds, the delay X=DELAY_REQ^(1..n) in seconds

[retrieve.fhir_search]

# Number of days to look into the past when searching for unfinished Encounters based on time.now() (resp. Sys.time()
# in R) if the table for Encounters in the database is still completely empty (initial start)
INITIAL_ENCOUNTER_START_DATE_IN_PAST_DAYS = 1800

# Number of days to look into the past when searching for unfinished Encounters based on time.now() (resp. Sys.time()
# in R) if the program is started regularly after the initial start. This number should always be at least large enough
# to include the entire period since the last start (preferably at least 1 day more).
USUAL_ENCOUNTER_START_DATE_IN_PAST_DAYS = 2

[retrieve.debug]
###################
# FOR DEBUG ONLY: #
###################

# If DEBUG_PERIOD_START is enabled then the start date is not time.now (resp. Sys.time() in R) but this date
#DEBUG_PERIOD_START = "2020-07-30"

# If DEBUG_PERIOD_END is enabled then the both values INITIAL_ENCOUNTER_START_DATE_IN_PAST_DAYS and
# USUAL_ENCOUNTER_START_DATE_IN_PAST_DAYS will be ignored and only Encounters in the range of DEBUG_PERIOD_START
# and DEBUG_PERIOD_END will be loaded. DEBUG_PERIOD_END will be ignored if DEBUG_PERIOD_START is not set.
#DEBUG_PERIOD_END = "2020-07-30"

[retrieve.patient_id_filtering]

# Enable the following parameter, if the relevant patient IDs (PID) are provided by an external
# system and they are stored in a text file (1 PID pr line). This deactivates the search for
# relevant PIDs via encounters (see ENCOUNTER_FILTER_PATTERNS).
#PATH_TO_PID_LIST_FILE = "./R-cds2db/source_PIDs.txt"

# Use this defintion to set the filters to identify all Encounters of the relevant wards. This is
# only necessary if the relevant Patient IDs should be derived from this Encounters and are not
# provided in any other way.
# Every entry defines a single condition. An Encounter will be accepted, if at least one condition is
# fulfilled (OR connected). A condition defines which expressions of the Encounter should have which
# value. The values can be defined as grep patterns. Multilpe subconditions can be connected by '+'
# (logcal AND). All Patterns are Case INSENSITIVE!
# A valid patterns variable name must start with 'ENCOUNTER_FILTER_PATTERN'.

ENCOUNTER_FILTER_PATTERN_1 = [
  "ward_name = '.*'"
]

# ENCOUNTER_FILTER_PATTERN_1 = [
#   "ward_name = 'Station 1'",
#   "serviceType/coding/code = '1600' + class/code = 'station|IMP|inpatient|emer|ACUTE|NONAC'",
#   "serviceType/coding/display = 'Unfallchirurgie' + class/code = 'station|IMP|inpatient|emer|ACUTE|NONAC'"
# ]

# ENCOUNTER_FILTER_PATTERN_2 = [
#   "ward_name = 'Station 2'",
#   "serviceType/coding/code = '0500' + class/code = 'station|IMP|inpatient|emer|ACUTE|NONAC'"
# ]

# ENCOUNTER_FILTER_PATTERN_3 = [
#   "ward_name = 'Station 3'",
#   "serviceType/coding/code = '1600' + class/code = 'station|IMP|inpatient|emer|ACUTE|NONAC'",
#   "serviceType/coding/display = 'Unfallchirurgie' + class/code = 'station|IMP|inpatient|emer|ACUTE|NONAC'"
# ]

[database]

PATH_TO_DB_CONFIG_TOML = "./cds_hub_db_config.toml"

In patient_fe sind Daten, fall_fe ist leer.

Wisst ihr, wie ich das beheben kann?

astruebi commented 1 week ago

@medicreitzner Wir vermuten, dass dieser Fehler bereits behoben ist. Wir erstellen heute ein neues Release aus dem develop-Branch.

medicreitzner commented 1 week ago

Alles klar. Danke sehr!

Die Konfiguration, die ich mitgeschickt habe, sollte nicht das Problem sein oder? Ist das Ausführen der Scripte mit dem Fhir Testserver https://mii-agiop-3p.life.uni-leipzig.de möglich?

astruebi commented 1 week ago
ENCOUNTER_FILTER_PATTERN_1 = [
  "ward_name = '.*'"
]

Beim ward_name soll kein Pattern drin stehen, sondern genau der String, der am Ende der pipeline im Frontend für diese Station angezeigt werden soll, also z.B. "Interpolar Station 1". Die Patterns stehen nur in den eigentlichen Filtern, die in diesem Fall gar nicht vorhanden sind.

Eine solche Einstellung ohne Filter ordnet alle Fälle dieser einen "logischen Station" zu.

medicreitzner commented 1 week ago

Wir verwenden jetzt den neusten Commit auf branch release mit komplett leerer Datenbank. Leider erhalten wir keine Daten von dem mii fhir server. Sektion retrieve.fhir_search ist auch aus der toml Datei cds2db_config.toml geflogen. Leider bei Script 1: Es werden keine Patient, Observation & Encounter Daten gefunden.

medicreitzner commented 3 days ago

Guten Tag zusammen,

vielen Dank für die Session kürzlich! Filter & API funktionieren jetzt. Ich baue die Images jetzt auf einer anderen VM.

Beim letztten Skript erhalte ich allerdings noch folgendes:

Configuration:
--------------
MAX_DIR_COUNT = 5
USE_TIMESTAMP_AS_RESULT_DIR_SUFFIX = FALSE
VERBOSE = 10
PATH_TO_DB_CONFIG_TOML = ./cds_hub_db_config.toml
REDCAP_TOKEN = C9C664116A286D55EBE15C508635B23E
REDCAP_URL = http://***.de:8082/redcap/api/

START
[TIME] 1728030585 2024-10-04 10:29:44 
Run Import from Database to Frontend: RUNNING ...
Try to connect with: 
   dbname=cds_hub_db
   host=cds_hub
   port=5432
   user=db2frontend_user
   password=***
   schema=db2frontend_out
SELECT record_id, patient_fe_id, pat_id, pat_cis_pid, pat_name, pat_vorname, pat_gebdat, pat_aktuell_alter, pat_geschlecht, patient_complete FROM v_patient 
SELECT record_id, redcap_repeat_instrument, redcap_repeat_instance, patient_id_fk, fall_fe_id, fall_pat_id, fall_id, fall_studienphase, fall_station, fall_aufn_dat, fall_aufn_diag, fall_gewicht_aktuell, fall_gewicht_aktl_einheit, fall_groesse, fall_groesse_einheit, fall_status, fall_ent_dat, fall_complete FROM v_fall 
Run Import from Database to Frontend: OK
END                                                                                                     
[1] TRUE

START
[TIME] 1728030586 2024-10-04 10:29:46 
Run Import from Frontend to Database: RUNNING ...
Try to connect with: 
   dbname=cds_hub_db
   host=cds_hub
   port=5432
   user=db2frontend_user
   password=***
   schema=db2frontend_in
[1] "Inserted in patient_fe, 5 rows (took 0.0101230144500732 seconds)"
Run Import from Frontend to Database: Error : Failed to initialise COPY : ERROR:  column "fall_op_geplant" of relation "fall_fe" does not exist

END

Tabelle cds_hub_db.fall_fe enthält die Spalte nicht.

Ist das ein bekannter Bug oder liegt das an meinem Setup?

Vielen Dank