dexter-psychometrics / dexter

Management, assessment, and psychometric analysis of data from educational and psychological tests
GNU Lesser General Public License v3.0

Error: Foreign key constraint failed. #8

Closed WrdeVries closed 1 year ago

WrdeVries commented 1 year ago

Hi,

When importing a booklet I get the following error:


> dexterdb = start_new_project(rules,"tests.db")
> add_booklet(dexterdb, booklet, "bigTest")
Error: FOREIGN KEY constraint failed

I have checked whether all items named in the booklet column names are available in the rules. This seems to be the case. If needed, I can provide a minimal example.
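A check of this kind can be sketched in base R as follows. This is only an illustration under stated assumptions: the `rules` and `booklet` objects below are made up, the rules are assumed to have an `item_id` column, and `person_id` is assumed to be the only non-item column in the booklet.

```r
# Made-up scoring rules: one row per item/response combination
rules = data.frame(
  item_id    = c("i1", "i1", "i2", "i2"),
  response   = c("A", "B", "A", "B"),
  item_score = c(1, 0, 0, 1)
)

# Made-up booklet in wide format: one row per person, one column per item
booklet = data.frame(
  person_id = c("p1", "p2"),
  i1        = c("A", "B"),
  i3        = c("B", "A")
)

# Columns that are supposed to be items (assumption: everything except person_id)
item_columns = setdiff(names(booklet), "person_id")

# Item columns that have no matching item_id in the rules
missing_from_rules = setdiff(item_columns, unique(rules$item_id))
print(missing_from_rules)
```

An empty result would mean every item column has at least one rule; it says nothing about whether every response value is covered.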

jessekps commented 1 year ago

Excuse the slow response; it was the Christmas holidays. If possible, I would like to see the rules and booklet objects. Could you attach an Rdata file to the issue?

WrdeVries commented 1 year ago

rdata files of rules and booklet.zip

Here are the Rdata files of the rules and booklet objects I am using to create the dexter db. Thank you for your time.

jessekps commented 1 year ago

When I try your example I get a slightly different result. So either this dataset somehow differs from the one in your first problem, or you possibly have an older dexter version.

load("rules.rdata")
load("booklet.rdata")

db = start_new_project(hrules,':memory:')

add_booklet(db, hbooklet, "bigTest")

the result is a warning:

The following responses are not in your rules (showing first 30):

                          item_id response
 d944e362bd27e66ab81ab60ffdb8889d       NA
 45cf374e1bb95ca1a549b50fea5808e5       NA
 e1f8562502c38843b5e3cbb6db71e767       NA
 53ac5a289bfc00c5c83f60fc42bea356       NA

This means you have cells in your data that are NA. I don't know what caused these, perhaps students did not answer some questions. I assume you will want to score these as zero points. A possible remedy could be:

# recode NA to the string '<missing>'
# because it's always best to be explicit
hbooklet[is.na(hbooklet)] = '<missing>'

# use auto_add_unknown_rules = TRUE, to automatically add a zero score rule to any responses 
# not listed in the rules
report = add_booklet(db, hbooklet, 'bigTest', auto_add_unknown_rules=TRUE)

# the add_booklet function returns a summary of the import, we can inspect this
# to be sure nothing untoward happened
report$zero_rules_added |> dplyr::count(response)
# A tibble: 1 × 2
  response      n
  <chr>     <int>
1 <missing>    46
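To see which item columns the NA cells come from before recoding them, a quick base R count per column may help. The sketch below uses a made-up `booklet` data.frame rather than the attached data.

```r
# Made-up booklet with some unanswered (NA) cells
booklet = data.frame(
  person_id = c("p1", "p2", "p3"),
  item_1    = c("A", NA, "B"),
  item_2    = c(NA, NA, "C"),
  stringsAsFactors = FALSE
)

# Number of NA cells in each column
na_per_column = colSums(is.na(booklet))

# Show only the columns that actually contain NA
print(na_per_column[na_per_column > 0])
```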

The original error message in your issue is very uninformative and should never occur. I would like to know if you used the most recent dexter version and, if you did, could you make a reprex or provide a dataset that replicates this error?

WrdeVries commented 1 year ago

Thank you for the elaborate answer. Very helpful. I have since tested the last data file I sent and I get the same result as you do, so all good there. However, that leaves my initial issue unsolved. I have therefore created new data objects that (hopefully) reproduce the error from my initial problem. I will add them here. My experiments were done with dexter version 1.2.2 (current). rdata.zip

I will try your solution on the original set and see whether that changes anything.

jessekps commented 1 year ago

I tried your original dataset and I can replicate your error. The original error message occurs because your column person_id has the integer64 datatype rather than the standard 32-bit R integer (probably because you pulled it from a database). I will update dexter so that 64-bit integers are handled correctly in the next version.
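To check whether a data.frame contains such columns, one could look for the `integer64` class. The sketch below is not part of dexter: `int64_columns` is a hypothetical helper, the data is made up, and the second half only runs if the bit64 package happens to be installed.

```r
# Hypothetical helper: names of all columns that carry the "integer64" class
int64_columns = function(df) {
  names(df)[vapply(df, inherits, logical(1), what = "integer64")]
}

# Made-up data with an ordinary 32-bit integer id
demo = data.frame(person_id = 1:3, item_1 = c("A", "B", "C"))
print(int64_columns(demo))   # no integer64 columns yet

# If bit64 is available, turn the id into integer64 and check again
if (requireNamespace("bit64", quietly = TRUE)) {
  demo$person_id = bit64::as.integer64(demo$person_id)
  print(int64_columns(demo))
}
```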

In the meantime you can circumvent this issue by simply converting the column to character.

load("orules.rdata")
load("obooklet.rdata")

db = start_new_project(orules,':memory:')

obooklet$person_id = as.character(obooklet$person_id)

add_booklet(db, obooklet, "bigTest")

this leads to the following error:

Error: UNIQUE constraint failed: dxAdministrations.person_id, dxAdministrations.booklet_id

This is also not an error message that I like, so I will improve it for the next dexter version. However, it means that your person_id column has duplicates (the same person took the same booklet multiple times) and this is not allowed.

> nrow(obooklet)
[1] 950
> dplyr::n_distinct(obooklet$person_id)
[1] 519
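To find out which persons are affected, something along these lines might be used. The data below is made up, and keeping only the first row per person is just one possible policy, shown purely as an assumption; which attempt to keep is a decision about the data.

```r
# Made-up booklet in which two persons appear more than once
booklet = data.frame(
  person_id = c("p1", "p2", "p2", "p3", "p3", "p3"),
  item_1    = c("A", "B", "B", "C", "A", "A")
)

# How often each person_id occurs
counts = table(booklet$person_id)

# person_ids that occur more than once
repeated = names(counts)[counts > 1]

# All rows belonging to those persons, for inspection
dup_rows = booklet[booklet$person_id %in% repeated, ]
print(dup_rows)

# One possible (assumed) policy: keep only the first row per person
deduped = booklet[!duplicated(booklet$person_id), ]
print(nrow(deduped))
```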

That is a data issue that you will have to resolve yourself but many thanks for bringing this to our attention.

jessekps commented 1 year ago

fixed in 60f927b