kosukeimai / fastLink

R package fastLink: Fast Probabilistic Record Linkage

Can't run gammaCKpar on a simple comparison instance #18

Closed LUSAQX closed 6 years ago

LUSAQX commented 6 years ago

Hi,

Please check the following code, which returned:

Error in { : task 1 failed - "subscript out of bounds"

dfA <- list(name = c('A2 Infant', 'Express Logistics'))
dfB <- list(name = c('A2 Infant', 'Cargo Solutions'))
g_firstname <- gammaCKpar(dfA$name, dfB$name)

But when I switch to the function gammaCK2par, it works.

Could you please advise how to fix the code so that gammaCKpar works on the instance above?

Cheers

LUSAQX commented 6 years ago

I found the solution: set custom cut.a and cut.p values instead of relying on the defaults.
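For readers who land here with the same error, the workaround can be sketched roughly as follows. The thread does not say which values were used, so the thresholds below are purely illustrative assumptions. They need to be chosen so that at least some pairs fall between cut.p and cut.a for the data at hand.

```r
library(fastLink)

# Same toy data as in the original report
dfA <- list(name = c("A2 Infant", "Express Logistics"))
dfB <- list(name = c("A2 Infant", "Cargo Solutions"))

# Illustrative thresholds only (not taken from the thread):
#   cut.a: similarity at or above which a pair counts as a full match
#   cut.p: lower bound of the partial-match band
g_name <- gammaCKpar(dfA$name, dfB$name, cut.a = 0.94, cut.p = 0.50)

# Inspect the structure of the returned agreement patterns
str(g_name, max.level = 1)
```

Widening the band by lowering cut.p makes it more likely that some pair lands in the partial-match category, but it also changes what counts as a partial match, so the values should reflect the data rather than be copied from this sketch.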

tedenamorado commented 6 years ago

Thanks a lot for raising this point!

You are completely right. If there are no observations between the first and second cutoffs, the function will fail. We will add a more informative error message when this happens.
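One way to check in advance whether any pair falls between the two cutoffs is to compute the string similarities directly. The sketch below uses the stringdist package and assumes the cutoffs are cut.a = 0.92 and cut.p = 0.88 (taken here to be the gammaCKpar defaults) with a Jaro-Winkler prefix weight of 0.10. It is a diagnostic approximation, not the package's internal code.

```r
library(stringdist)

# Same toy data as in the original report
namesA <- c("A2 Infant", "Express Logistics")
namesB <- c("A2 Infant", "Cargo Solutions")

# Assumed cutoffs (believed to match the gammaCKpar defaults)
cut.a <- 0.92
cut.p <- 0.88

# Jaro-Winkler similarity for every pair (rows: namesA, columns: namesB)
sim <- 1 - stringdistmatrix(namesA, namesB, method = "jw", p = 0.1)

# Number of pairs in the full-match and partial-match bands
n_exact   <- sum(sim >= cut.a)
n_partial <- sum(sim >= cut.p & sim < cut.a)

print(round(sim, 3))
cat("full matches:", n_exact, " partial matches:", n_partial, "\n")
```

If n_partial is zero, no pair falls between the two cutoffs, which is the situation described above. Either adjust cut.a and cut.p, or use gammaCK2par, which the original poster reported as working on this data.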

tedenamorado commented 6 years ago

Hi, I just wanted to let you know that, thanks to the error you brought to our attention, we have improved gammaCKpar. If I am not mistaken, you are using a Windows machine, which is why you got an error message related to the parallelization of the function. More important than the parallelization problem is the fact that we did not have a proper warning message telling users that partial matching sometimes does not work with the default settings. Thanks to you, we have pushed an updated version of gammaCKpar that includes such a warning.

Please, if you find any further issues when using fastLink, just say the word!

Thanks a lot for your great feedback and Happy holidays!