OHDSI / DbDiagnostics

Package to profile a database and execute data diagnostics based on individual analysis settings
https://ohdsi.github.io/DbDiagnostics/
Apache License 2.0

Error with uploading results to local database #21

Closed filipmaljkovic closed 1 month ago

filipmaljkovic commented 2 months ago

I ran the code from section "2. Upload results to a local database" and here is the output I'm getting:

...
> db_profile_results <- read.csv(paste0(resultsLocation,"/20240716/db_profile_results.csv"), stringsAsFactors = F, colClasses = c("STRATUM_1"="character"))
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote,  :
  not all columns named in 'colClasses' exist
>
> # make sure the columns are read in as characters to facilitate dbDiagnostics execution
>
> db_profile_results$STRATUM_1 <- as.character(db_profile_results$STRATUM_1)
Error in `$<-.data.frame`(`*tmp*`, STRATUM_1, value = character(0)) :
  replacement has 0 rows, data has 21023
Calls: $<- -> $<-.data.frame
Execution halted
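
The warning "not all columns named in 'colClasses' exist" suggests that the file has no column named exactly STRATUM_1, for example because the header is lower-case. In that case db_profile_results$STRATUM_1 is NULL, as.character(NULL) has length 0, and the assignment fails with "replacement has 0 rows". A minimal shell sketch for checking the header follows; the helper name check_column and the demo file are made up for illustration, and the comma split assumes that header names contain no embedded commas.

```shell
# check_column FILE NAME
# Prints "exact" if NAME appears in the header of FILE exactly as written,
# "case-mismatch: <actual name>" if it only matches ignoring case,
# and "missing" otherwise.
check_column() {
  head -n 1 "$1" | tr -d '"\r' | tr ',' '\n' | awk -v want="$2" '
    $0 == want                   { exact = 1 }
    tolower($0) == tolower(want) { actual = $0 }
    END {
      if (exact)              print "exact"
      else if (actual != "")  print "case-mismatch: " actual
      else                    print "missing"
    }'
}

# Demo on a made-up file whose header is lower-case.
demo=$(mktemp)
printf 'analysis_id,stratum_1,count_value\n1,,100\n' > "$demo"
check_column "$demo" STRATUM_1   # prints: case-mismatch: stratum_1
```

If the real file reports a case mismatch, using the actual column name in both the colClasses argument and the as.character line would be the first thing to try.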

filipmaljkovic commented 1 month ago

I'm afraid I'm getting the same error message after updating DbDiagnostics, rerunning executeDbProfile, and then rerunning the "upload" code.

clairblacketer commented 1 month ago

Hi @filipmaljkovic, when I get an error like this, it is usually because I am running executeDbDiagnostics with duplicates in my dbProfile extract. There is a weird bug: if you set appendAchilles = FALSE but set the writeTo schema to the Achilles schema, it will pull the Achilles results twice. Is it possible there are duplicates in your file?

filipmaljkovic commented 1 month ago

If you mean db_profile_results_dist.csv, there don't appear to be duplicates there.

Looking at db_profile_results.csv, there don't appear to be duplicates there either.

cat db_profile_results.csv | sort | uniq -c | sort -nr returns a count of 1 for every line, meaning that there are no exact duplicate rows.
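
The pipeline above counts every line, including the header. A slightly more direct sketch skips the header and prints only the rows that occur more than once, so empty output means no exact duplicates. The helper name list_duplicate_rows and the demo file are illustrative only. Note that this detects exact duplicate rows, not rows that share the same keys but carry different counts.

```shell
# list_duplicate_rows FILE
# Skips the header line and prints each data row that occurs more than once.
list_duplicate_rows() {
  tail -n +2 "$1" | sort | uniq -d
}

# Demo on a made-up file with one repeated row.
demo=$(mktemp)
printf 'analysis_id,stratum_1,count_value\n1,A,10\n2,B,20\n1,A,10\n' > "$demo"
list_duplicate_rows "$demo"   # prints: 1,A,10
```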

I do, however, have "NA" values for several strata. I don't know whether that's expected, or whether it could account for an error like this. Probably not, since the failure occurs on STRATUM_1, which does appear to have values (except for the first data row in the CSV, but even if I remove that row, the error persists).
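
To put a number on the NA observation, the following sketch counts the data rows whose value in a given column is NA or empty, matching the column name without regard to case. The helper name count_na and the demo data are hypothetical, and the naive comma split assumes that no field contains a quoted comma.

```shell
# count_na FILE NAME
# Prints how many data rows have "NA" or an empty value in column NAME.
# The header is matched ignoring case; fields are split on plain commas.
count_na() {
  tr -d '"\r' < "$1" | awk -F',' -v want="$2" '
    NR == 1 {
      for (i = 1; i <= NF; i++) if (tolower($i) == tolower(want)) col = i
      if (!col) { print "column not found"; exit 1 }
      next
    }
    $col == "NA" || $col == "" { n++ }
    END { if (col) print n + 0 }'
}

# Demo on made-up data.
demo=$(mktemp)
printf 'analysis_id,stratum_1,stratum_2\n1,NA,NA\n2,8507,NA\n3,8532,\n' > "$demo"
count_na "$demo" STRATUM_1   # prints: 1
count_na "$demo" stratum_2   # prints: 3
```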