seunglee98 / fedmatch

Problem using "wgt_jaccard_distance" when doing a multivar match #24

Open xiaomaohao opened 2 years ago

xiaomaohao commented 2 years ago

When I ran a multivar match with the match_type "wgt_jaccard_distance", I got the following error:

Error in checkForRemoteErrors(val) : 
  8 nodes produced errors; first error: column(s) not found: ele_school_compare, middle_school_compare, high_school_compare

I can't figure out why. Please help.

Update: after browsing the code, I found that the type "wgt_jaccard_distance" mentioned in the vignette might be wrong; it should be "wgt_jaccard_dist". After I changed it, the error above went away, but a new one appeared:

Error in checkForRemoteErrors(val) : 8 nodes produced errors; first error: string_1 and string_2 must be the same non-zero length.

c0webster commented 2 years ago

Thanks for this! Great catch on wgt_jaccard_dist; I've updated the documentation accordingly. Your second error is raised during the multivar match. The check that triggers it is: length(string_1) != length(string_2) | length(string_1) == 0 | length(string_2) == 0

Depending on the match you're doing (like a tier match), you may have already removed all the possible matches, and it looks like it's trying to match a string vector of length 0. Are you doing a tier match?
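
To illustrate, here is a minimal base R sketch of that check. The helper name check_string_lengths is hypothetical and not part of fedmatch; only the condition and the error message are taken from this thread. It shows that a pair of zero-length vectors is enough to produce the error:

```r
# Hypothetical helper (not a fedmatch function) that mirrors the quoted check.
check_string_lengths <- function(string_1, string_2) {
  if (length(string_1) != length(string_2) |
      length(string_1) == 0 |
      length(string_2) == 0) {
    stop("string_1 and string_2 must be the same non-zero length.")
  }
  invisible(TRUE)
}

# Equal, non-zero lengths pass the check.
ok <- check_string_lengths(c("a", "b"), c("c", "d"))

# Zero-length inputs fail, as would happen if no candidate rows remain.
err_msg <- tryCatch(
  check_string_lengths(character(0), character(0)),
  error = function(e) conditionMessage(e)
)
print(err_msg)
```

If the vectors passed to the comparison are empty (or differ in length), it may be worth checking how many rows remain in each dataset just before the multivar step.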

xiaomaohao commented 2 years ago

I am doing a multivar match rather than a tier match. I have three string variables and want to match on all three using "wgt_jaccard_dist". That is when the error above appears.

c0webster commented 2 years ago

Hm, okay. Can you post some code you're using? One thing to note is that you can only match 2 variables at a time, so I'm not sure how you're implementing a match with three string variables at once.