SkBlaz / rakun

Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation
GNU General Public License v3.0

Information about keywords that are 2-grams or 3-grams #4

Closed igormis closed 4 years ago

igormis commented 4 years ago
  1. I have tried the code and it is impressive. However, I am having difficulty getting the output to show 2-grams or 3-grams, rather than single words, as the keywords for a given text. Any advice on how to tune the hyperparameters?

  2. Concerning the hyperparameters, could you explain some of them, such as:

    • "pair_diff_length":2,
    • "bigram_count_threshold":2,
    • "num_tokens":[1,2],
    • "max_similar" : 3, ## n most similar can show up n times
    • "max_occurrence" : 3} ## maximum frequency overall

Thanks.

SkBlaz commented 4 years ago

Hello @igormis !

Thanks for the issue. Indeed, I'm realizing the hyperparameters were poorly explained. Hence, I've added more detailed descriptions here: https://github.com/SkBlaz/rakun#hyperparameter-explanation

Further, I've added an example where only 2-gram keywords are detected here: https://github.com/SkBlaz/rakun/blob/master/examples/multi_term.py
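
For readers skimming the thread, here is a minimal sketch of the idea; it is not taken from multi_term.py. The hyperparameter names are the ones quoted in the question above. Everything else is an assumption made for illustration: that setting `num_tokens` to `[2]` restricts results to 2-token keywords, that the detector returns `(keyword, score)` pairs, the hypothetical helper `keep_ngrams`, and the made-up sample output. The sketch deliberately does not import the library, so it only demonstrates the filtering logic.

```python
# Hyperparameter names are the ones quoted in this thread; the values and the
# reading of "num_tokens" as "allowed keyword lengths in tokens" are assumptions.
hyperparameters = {
    "pair_diff_length": 2,
    "bigram_count_threshold": 2,
    "num_tokens": [2],  # assumption: allow only 2-token keywords
    "max_similar": 3,
    "max_occurrence": 3,
}


def keep_ngrams(keywords, allowed_lengths):
    """Hypothetical post-filter, not part of the library.

    Keeps only the (keyword, score) pairs whose keyword contains an
    allowed number of whitespace-separated tokens.
    """
    return [
        (keyword, score)
        for keyword, score in keywords
        if len(keyword.split()) in allowed_lengths
    ]


# Made-up detector output, used purely to illustrate the filter.
example_output = [
    ("keyword", 0.9),
    ("keyword extraction", 0.7),
    ("rank based extraction", 0.4),
]

filtered = keep_ngrams(example_output, hyperparameters["num_tokens"])
print(filtered)
```

A post-filter like this can also serve as a sanity check: if it drops many results, the detector is probably still being allowed to return single words.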

Hopefully, this resolves the confusion.

igormis commented 4 years ago

@SkBlaz perfect, thanks. What about lemmatization? Can it be used as a hyperparameter, and if so, how?

SkBlaz commented 4 years ago

Indeed, I forgot that one. I've updated multi_term.py to include lemmatization, and I've updated the hyperparameter descriptions as well. Hope this helps!

igormis commented 4 years ago

Perfect, now it's much more understandable, thanks @SkBlaz

SkBlaz commented 4 years ago

No problem! I'd suggest you close the issue if you deem it solved, thanks!