Open tracits opened 4 years ago
I think we need to decide to what extent problems like this should be handled by the tool or by the user. One way would be to modify the database before importing: if the user adds a field to the codebook, it is up to the user to make sure the database matches the codebook, for example by manually adding that column to the database and setting all its values to unknown. It's easy to see how that could make for an annoying workflow on the user's end, however. If we choose to deal with it on the tool side, there needs to be a way to check that a database most likely belongs to a specific codebook. The suggestion to add `valid_from` and `valid_to` columns could fix this. These dates could then be compared to when the database was created, I guess.
Agreed. Alternatively, each codebook could have a version number and each record a reference to the codebook version it was created under. But then we are starting to talk about a somewhat more advanced workflow for the codebooks which, even though technically appealing, we may want to avoid considering ODTB. In that case the valid-from and valid-to solution may be a possibility.
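For comparison, here is a minimal sketch of what the version-number idea could look like. Everything in it is an assumption made for illustration: the `CODEBOOK_VERSIONS` mapping, the `codebook_version` key on each record, and the variable names are hypothetical and not taken from the tool.

```python
# Hypothetical: each codebook version lists the variables it defines.
CODEBOOK_VERSIONS = {
    1: ["age", "sex"],
    2: ["age", "sex", "emergency_room"],
}


def missing_fields(record, versions=CODEBOOK_VERSIONS):
    """Return the variables the record lacks according to its own codebook version."""
    expected = versions[record["codebook_version"]]
    return [name for name in expected if name not in record]


# A record created under version 1 is complete even though it has no
# emergency_room field; the same fields under version 2 are incomplete.
old_record = {"codebook_version": 1, "age": 40, "sex": "F"}
new_record = {"codebook_version": 2, "age": 40, "sex": "F"}
```

The point is that a record is validated only against the codebook version it references, so adding a variable in a later version does not invalidate older records.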
Right now, we have added the extra variables for which emergency room the patients were revived at in TAFT. That creates, I think, about 1700 records that need manual updating in the tool. We could make a workaround for that specific scenario by modifying the datasets before import, but it may be a case to keep in mind for the future nonetheless.
With the current logic, if you add a new variable to the codebook, all records created before that addition will not have the variable. When importing a dataset, or working with a loaded one in the updated tool, all older records will be invalid since they are missing the newly defined field. As of now, all such records need to be updated manually.
How do we solve this in the best way? It's very possible that the codebook may change over time.
One way would be to add two more columns (if we are talking CSV-language) that define `valid_from` and `valid_to` for each variable in the codebook. This lets you check whether the variable was valid when the record was created (or maybe from doar time). If the record falls outside that range, for example because it is older than `valid_from`, the corresponding fields in the record are recorded as N/A at import and export.
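A rough sketch of that check, to make the idea concrete. The codebook layout, the dates, and the names `is_valid_on` and `normalize_record` are hypothetical, and the sketch assumes each record carries a single date to compare against.

```python
from datetime import date

# Hypothetical codebook: variable name -> (valid_from, valid_to).
# None means the range is open-ended on that side.
CODEBOOK = {
    "age": (None, None),
    "emergency_room": (date(2021, 3, 1), None),  # made-up introduction date
}


def is_valid_on(variable, record_date, codebook=CODEBOOK):
    """True if the variable was part of the codebook on record_date."""
    valid_from, valid_to = codebook[variable]
    if valid_from is not None and record_date < valid_from:
        return False
    if valid_to is not None and record_date > valid_to:
        return False
    return True


def normalize_record(record, record_date, codebook=CODEBOOK):
    """Fill variables that were not valid on record_date with "N/A"."""
    normalized = {}
    for variable in codebook:
        if is_valid_on(variable, record_date, codebook):
            # None here marks a genuinely missing value that still needs input.
            normalized[variable] = record.get(variable)
        else:
            normalized[variable] = "N/A"
    return normalized


older = normalize_record({"age": 40}, date(2020, 1, 1))
newer = normalize_record({"age": 40}, date(2022, 1, 1))
```

With this approach an older record gets "N/A" for `emergency_room` and can pass validation, while a record dated after the variable was introduced still shows the field as missing.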