junhewk / RcppMeCab

RcppMeCab: Rcpp Interface of CJK Morpheme Analyzer MeCab
24 stars 8 forks source link

Check if the length of features is not less than 7 inside posDFRcpp and posParallelDFRcpp #11

Closed paithiov909 closed 3 years ago

paithiov909 commented 3 years ago

This PR aims to change these points.

  1. Replace Travis CI with GitHub Actions
  2. Change the location of libmecab.dll that the package looks up under windows
  3. [Bug fix] Check if the length of features is not less than 7 inside posDFRcpp and posParallelDFRcpp

Change the location of libmecab.dll that the package looks up under windows

I use library.dynm function instead of useDynLib since PR #8. library.dynm can be passed DLLpath argument, which changes the search order of dynamic libraries under windows so that it ensures to load libmecab.dll that this package has. However, this change might make an installation issue when test-loads of the package. So, I would like to revert this change.

[Bug fix] Check if the length of features is not less than 7 inside posDFRcpp and posParallelDFRcpp

I suppose this is Japanese specific problem. MeCab (with Japanese IPA dictionary) estimates features of unknown words by default, however, the estimated unknown features have no parts that correspond to 'Readings' and 'Pronunciation'. Because original node features of the IPA dictionary have 9 parts, estimated unknown features have only 7 parts. So, if the provided text has some unknown words, functions occur an error saying std::bad_alloc. To prevent this, I wrote checks if the length of features is not less than 7 inside posDFRcpp and posParallelDFRcpp. See also a related issue on other repo.

junhewk commented 3 years ago

@paithiov909 Thanks for the contribution. However, if ../inst/libs is changed to ../inst/mecab in Makevars.win, then the installation is failed in my system. Could you check this again? Otherwise, the update works well.

junhewk commented 3 years ago

@paithiov909 Your solution works in Rstudio build environment, but fails if I try to install the package in the repo directly. Let me check this again and I revert the Makevars.win file for now.

paithiov909 commented 3 years ago

It sounds strange to me. I could successfully install the package from your repository using remotes::install_github("junhewk/RcppMeCab"). I use R4.0.2 under windows.

Anyway, that is not critical for my commitment. I just changed ../inst/libs to ../inst/mecab so as to suppress warning messages claiming that you should not use the libs directory by yourself. However, that change is optional. Please revert it if necessary.