uvacw / inca


fix for language issues with lda #330

Closed FeLoe closed 6 years ago

FeLoe commented 6 years ago

This is at least a temporary fix: the language can now be passed as an argument. It uses SnowballStemmer instead of PorterStemmer for other languages, but there is no equivalent for the lemmatizer...

damian0604 commented 6 years ago

I think it's a better idea not to hard-code the language but rather to read it from the settings.cfg file. See the latest commit. What do you think, @FeLoe?
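For readers following along, here is a minimal sketch of what reading the language from a config file could look like with Python's standard `configparser`. The section name `processing`, the option name `language`, and the helper `load_language` are assumptions made for illustration; they are not taken from INCA's actual settings.cfg layout or from the commit mentioned above.

```python
import configparser


def load_language(parser, section="processing", option="language",
                  fallback="english"):
    """Return the configured language, or `fallback` if it is not set.

    Hypothetical helper: the section and option names are assumptions,
    not INCA's real configuration keys.
    """
    # `fallback` is returned when either the section or the option is missing.
    return parser.get(section, option, fallback=fallback).strip().lower()


# In real use one would call parser.read("settings.cfg") instead; an inline
# string keeps this sketch self-contained.
EXAMPLE_SETTINGS = """
[processing]
language = Dutch
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_SETTINGS)
language = load_language(parser)
```

Falling back to a default means an older settings.cfg without the new option would still work instead of raising an error.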

FeLoe commented 6 years ago

Yes, I think it is a good idea. I just had to change the name of the settings file, and now everything is working!

damian0604 commented 6 years ago

Thanks! I'll revert that though, because default_settings is just an example. When installing INCA, default_settings.cfg is automatically copied to settings.cfg. So only people with older installations (like us) have to change settings.cfg by hand. This is because settings.cfg can also contain login info and so on, so it's not supposed to be on GitHub.
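The copy-on-install behaviour described here can be sketched roughly as follows. This is not INCA's installer code: the helper name `ensure_settings` is hypothetical, and the only details taken from the thread are the two file names. The point it illustrates is that an existing settings.cfg is never overwritten, since it may hold login details.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_settings(directory, default_name="default_settings.cfg",
                    target_name="settings.cfg"):
    """Copy the example settings to settings.cfg unless it already exists.

    Returns True if a new file was created, False if one was already there.
    Hypothetical helper, not part of INCA.
    """
    directory = Path(directory)
    target = directory / target_name
    if target.exists():
        # Leave local settings alone: they may contain credentials.
        return False
    shutil.copyfile(directory / default_name, target)
    return True


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "default_settings.cfg").write_text(
        "[processing]\nlanguage = english\n"
    )
    created_first = ensure_settings(tmp)   # settings.cfg is created
    created_second = ensure_settings(tmp)  # existing file is kept as-is
```

Because an existing file is kept, older installations do not pick up options added to default_settings.cfg later, which matches the manual change described above.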

FeLoe commented 6 years ago

Ah ok, it's just that I got an error when trying the code. Maybe we need an explanation for this somewhere?

damian0604 commented 6 years ago

There is, in fact, an issue for that: https://github.com/uvacw/inca/issues/244. It's just that no one has had the time to pick it up (and/or a great idea of how to solve it).

damian0604 commented 6 years ago

I'm gonna fix #244