Closed clnsmth closed 5 years ago
Thanks, Colin, for alerting me to the issue. What are the data_package_read() errors? I'd appreciate the DOI too.
On Mon, Jul 29, 2019 at 3:21 PM Colin Smith notifications@github.com wrote:
Assigned #61 https://github.com/IMCR-Hackathon/datapie/issues/61 to @atn38 https://github.com/atn38.
Sure thing @atn38.
Logic at line 16 of use_missing_code() is configured such that the code within this if statement will never run. Specifically, it uses in rather than %in%. These errors only occur for data packages containing attribute_metadata objects with missingValueCode (i.e. the outer if logic is working correctly).
doi:10.6073/pasta/c964ed49ff284dfcaaf53719651da60f (works)
doi:10.18739/A2DP3X (errors)
Error:
Error in matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn, :
length of 'dimnames' [2] not equal to array extent
Called from: matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn, cn))
While a lot of datapie's functions aren't readily testable, some are. Seems like this one could be. Here is a great resource on writing unit tests for R. I'll bring up the need for unit testing in our Thursday meeting.
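To make the unit testing idea concrete, here is a minimal sketch of what a test for this kind of logic might look like. The helper replace_missing_codes() is hypothetical and is not datapie's actual use_missing_code(); it only illustrates the %in% membership check discussed above. The tests assume the testthat package and are skipped if it is not installed.

```r
# Hypothetical helper, NOT datapie's use_missing_code():
# replace values that match any declared missing value code with NA.
replace_missing_codes <- function(x, codes) {
  if (length(codes) == 0) {
    return(x)
  }
  # %in% tests each element of x for membership in codes and returns a
  # logical vector the same length as x, so it is safe to use as an index.
  x[as.character(x) %in% as.character(codes)] <- NA
  x
}

# Assumption: testthat is the testing framework; skip quietly if absent.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("declared codes become NA", {
    out <- replace_missing_codes(c("1.2", "-9999", "NaN", "3.4"),
                                 c("-9999", "NaN"))
    testthat::expect_equal(out, c("1.2", NA, NA, "3.4"))
  })
  testthat::test_that("no codes leaves the input unchanged", {
    testthat::expect_equal(replace_missing_codes(c(1, 2), character(0)),
                           c(1, 2))
  })
}
```

Keeping the replacement step in a small function that takes plain vectors, rather than a whole data package, is what makes it testable without downloading anything.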
@clnsmth, thanks for the suggestion. I followed it, but it didn't seem to work. Then I found that there is a mismatch in the second data table in doi:10.18739/A2DP3X between the attributeName listed in the metadata and the column names in the data. This was the source of the error. I'll rewrite use_missing_code() so it doesn't rely on the names in the two places always matching up. It will probably include fuzzy matching and/or order matching of some sort.
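As a rough sketch of the matching idea, assuming nothing about how the rewrite will actually be done: the hypothetical helper below pairs metadata attribute names with data column names by exact match first, then by a loose match that ignores case and punctuation, and finally by position when the two sides have the same number of names. The function name and the normalization rule are illustrative assumptions only.

```r
# Hypothetical helper: map metadata attribute names onto data columns.
# Returns an integer vector of column positions, one per attribute
# (NA where no column could be paired).
match_attributes <- function(attribute_names, column_names) {
  normalize <- function(s) tolower(gsub("[^[:alnum:]]", "", s))

  # 1. Exact match.
  idx <- match(attribute_names, column_names)

  # 2. Loose match on normalized names, ignoring columns already claimed.
  loose <- match(normalize(attribute_names), normalize(column_names))
  loose[loose %in% idx] <- NA
  idx[is.na(idx)] <- loose[is.na(idx)]

  # 3. Order matching, only when the counts agree and the column at the
  #    same position has not been claimed by another attribute.
  if (length(attribute_names) == length(column_names)) {
    unmatched <- which(is.na(idx))
    free <- unmatched[!(unmatched %in% idx)]
    idx[free] <- free
  }
  idx
}

# Made-up names for illustration; not taken from doi:10.18739/A2DP3X.
match_attributes(c("site_id", "Temp (C)", "depth"),
                 c("site_id", "temp_c", "Depth_m"))
```

Attributes that still come back as NA could then be skipped with a warning rather than stopping data_package_read(). Two attributes that normalize to the same column are not resolved by this sketch.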
Hi @atn38, the above example will not reproduce the error on the development branch because I commented out the call to use_missing_code() in the data_package_read() function (see https://github.com/IMCR-Hackathon/datapie/issues/61#issue-474238099).
Yes, incongruence between data and metadata column names makes programmatic workflows challenging! This is a prime example of the role quality metadata plays in data reuse!
Hi @atn38, please test and revise use_missing_code() when you get a chance. The logic doesn't seem quite right and was resulting in data_package_read() errors. You'll have to uncomment this block of code in data_package_read() when it's working again. Thanks!