rfhb / ctrdata

Aggregate and analyse information on clinical trials from public registers
https://rfhb.github.io/ctrdata/

Erroneous entries in register fields (e.g. date) not flagged with dbGetFieldsIntoDf() #20

Closed rfhb closed 3 years ago

rfhb commented 3 years ago

The function dbGetFieldsIntoDf() types certain fields, such as "n_date_of_ethics_committee_opinion", as calendar dates. However, when a register record has a non-conforming entry in such a field (e.g., "2020-88-99"), the function silently replaces the entry with NA, and the user is not made aware of this. Reported by @florianlasch.
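The silent replacement can be reproduced with base R alone. This is a minimal sketch, assuming the typing step relies on something like as.Date() with an explicit format; the issue does not show the package internals, so that part is an assumption.

```r
# Base R only. That ctrdata converts dates this way is an assumption.
entries <- c("2021-01-01", "2020-88-99")

# With an explicit format, as.Date() returns NA for entries it cannot parse
# and raises neither a warning nor an error.
converted <- as.Date(entries, format = "%Y-%m-%d")

print(converted)
print(is.na(converted))
```

The second entry comes back as NA, and nothing tells the user that the register record held a non-empty value.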

rfhb commented 3 years ago

Fixed with commit a534267.

Example:

testdf <- data.frame(
  "_id" = c("2008-123456-78", "2008-123456-79", "2008-123456-80", "2008-123456-81"),
  "n_date_of_ethics_committee_opinion" = c("2021-01-01", "900-14-99", "", "anno X"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
testdf <- ctrdata:::typeField(dfi = testdf)

now generates this user message:

Unexpected string(s) in column 'n_date_of_ethics_committee_opinion':
900-14-99 / anno X, for _id(s)
2008-123456-79 / 2008-123456-81
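
For readers who want a similar check in their own code, the following is a rough sketch of how such a message could be produced. The helper flag_unparsable_dates() and its parameters are hypothetical and are not part of ctrdata; the actual fix in the commit may work differently. In this sketch, empty strings are converted to NA without being reported.

```r
# Hypothetical helper, not part of ctrdata: converts a character column to
# Date and reports non-empty entries that could not be parsed.
flag_unparsable_dates <- function(df, column, id_column = "_id",
                                  format = "%Y-%m-%d") {
  raw <- df[[column]]
  parsed <- as.Date(raw, format = format)

  # Entries that held some text but became NA after conversion
  bad <- is.na(parsed) & !is.na(raw) & nzchar(raw)

  if (any(bad)) {
    message(
      "Unexpected string(s) in column '", column, "':\n",
      paste(raw[bad], collapse = " / "), ", for ", id_column, "(s)\n",
      paste(df[[id_column]][bad], collapse = " / ")
    )
  }

  df[[column]] <- parsed
  df
}

# Same test data as in the example above
testdf <- data.frame(
  "_id" = c("2008-123456-78", "2008-123456-79", "2008-123456-80", "2008-123456-81"),
  "n_date_of_ethics_committee_opinion" = c("2021-01-01", "900-14-99", "", "anno X"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
typed <- flag_unparsable_dates(testdf, "n_date_of_ethics_committee_opinion")
```

Using message() rather than stop() keeps the conversion going while still making the user aware of the affected records.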