koheiw / newsmap

Semi-supervised algorithm for geographical document classification

Add Turkish #77

Closed koheiw closed 1 month ago

koheiw commented 2 months ago

Add TR dictionary

koheiw commented 2 months ago

@LungtaSEKI, thanks for your PRs. I merged both and am getting ready to release them in a new version of the package.

The test is failing because the dictionary does not match "Büyük Britanya" (Great Britain). Do you think you need to edit the dictionary?

> txt_tr <- c("Bu İrlanda hakkında bir makale.",
+             "Bu Büyük Britanya hakkında bir makale.")
> toks_tr <- tokens(txt_tr)
> tokens_lookup(toks_tr, data_dictionary_newsmap_tr, levels = 3)
Tokens consisting of 2 documents.
text1 :
[1] "IE"

text2 :
character(0)
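
For reference, here is a minimal sketch of how a multi-word entry could make the second text match. It assumes a hypothetical toy dictionary `dict_tr` with the same three-level nesting (continent, region, country); it is not the actual `data_dictionary_newsmap_tr`, and the key names `EUROPE` and `NORTH` are placeholders.

```r
library(quanteda)

# Hypothetical toy dictionary, NOT the real data_dictionary_newsmap_tr.
# Three nested levels: continent > region > country code.
dict_tr <- dictionary(list(
  EUROPE = list(
    NORTH = list(
      IE = "İrlanda",
      # A value containing a space is treated as a multi-word phrase
      GB = "Büyük Britanya"
    )
  )
))

txt_tr <- c("Bu İrlanda hakkında bir makale.",
            "Bu Büyük Britanya hakkında bir makale.")
toks_tr <- tokens(txt_tr)

# levels = 3 returns the country-level keys
toks_country <- tokens_lookup(toks_tr, dict_tr, levels = 3)
print(toks_country)
```

With an entry like this, the second text should no longer come back empty. Whether the real dictionary should list the full phrase, a shorter form such as "Britanya", or both is a separate decision for the dictionary author.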

LungtaSEKI commented 1 month ago

@koheiw My apologies for getting back to you so late. I have just added Britanya and corrected other typos. I'd really appreciate it if you could check them.

koheiw commented 1 month ago

@LungtaSEKI Where can I find the updated file? It would be easiest if you could edit the file on this branch. Because you are a collaborator, you can clone this repository, then commit and push directly.

koheiw commented 1 month ago

Done. Thanks @LungtaSEKI.