LIAAD / yake

Single-document unsupervised keyword extraction
https://liaad.github.io/yake

Porting it to another language: Swift for iOS #67

Open mrigankgupta opened 2 years ago

mrigankgupta commented 2 years ago

I am looking into whether I can port YAKE to iOS/macOS. I am a novice in Python and data science 😿. I assume the main logic is written in pke->Yake.py, but there is also another file, Yake->Yake.py. What is the difference between the two? I can't find the latter referencing the former. Can anyone point me to more resources that I can read?

https://asset-pdf.scinapse.io/prod/2790109590/2790109590.pdf
https://medium.com/gumgum-tech/exploring-different-keyword-extractors-statistical-approaches-38580770e282

arianpasquali commented 2 years ago

Nice that you are trying to write YAKE in Swift. I would recommend taking a look at the short paper to get a better understanding of what it does and why. For a more detailed explanation you can also check the journal paper.

If you need more examples besides this repository, you can check an alternative Python implementation by Florian or the Scala implementation by John Snow Labs for the Spark NLP framework.

Cheers

mrigankgupta commented 2 years ago

Thanks for the links! I want to ask one thing regarding the preprocessing: can we discard the chunks that are unparsable or digits? I might be wrong here, but as far as I can see we are not using them later. I am using Apple's NaturalLanguage framework, and it behaves a little differently from segtok's web_tokenizer.
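To make the question concrete, here is a minimal Python sketch of how tokens could be tagged as digits or unparsable chunks before deciding whether to drop them. The tag names ("d", "u", "p"), the function names `tag_token` and `tag_tokens`, and the exact rules are assumptions for illustration only; nothing in this thread confirms that they match what YAKE does.

```python
import string

# Hypothetical tags for this sketch: "d" = digit, "u" = unparsable, "p" = parsable.
PUNCTUATION = set(string.punctuation)


def tag_token(token: str) -> str:
    """Classify a token as digit ("d"), unparsable ("u"), or parsable ("p")."""
    # Anything float() accepts after dropping thousands separators counts as a number.
    # Note that float() also accepts strings such as "inf" and "nan".
    try:
        float(token.replace(",", ""))
        return "d"
    except ValueError:
        pass

    n_digits = sum(ch.isdigit() for ch in token)
    n_alpha = sum(ch.isalpha() for ch in token)
    n_punct = sum(ch in PUNCTUATION for ch in token)

    # Assumed rule: mixed letters and digits, no letters or digits at all,
    # or more than one punctuation mark makes the token unparsable.
    if (n_digits > 0 and n_alpha > 0) or (n_digits == 0 and n_alpha == 0) or n_punct > 1:
        return "u"
    return "p"


def tag_tokens(tokens):
    """Tag every token but keep it in place, so token positions stay intact."""
    return [(tok, tag_token(tok)) for tok in tokens]
```

The sketch tags tokens instead of deleting them. Dropping tokens before building multi-word candidates would make words adjacent that were not adjacent in the original text, so keeping them in place and skipping them later is the more conservative choice.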

mrigankgupta commented 2 years ago

@arianpasquali I am pretty much done with my raw port, but I guess I am missing something, as my results do not match (at least beyond the top 2-3), especially for candidates with multiple terms.
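For comparing multi-term results, it may help to check the candidate score in isolation. The sketch below follows the formula given in the YAKE paper, S(kw) = prod(S(w)) / (TF(kw) * (1 + sum(S(w)))), where a lower score means a more relevant keyword. The function name `candidate_score` and its parameters are hypothetical, and the sketch ignores any special handling of stopwords inside a candidate, which could be one source of a mismatch.

```python
from math import prod  # requires Python 3.8+


def candidate_score(term_scores, candidate_tf):
    """Combine per-term scores into one candidate score (lower is better).

    term_scores:  the scores S(w) of the terms in the candidate, in order
    candidate_tf: how many times the whole candidate occurs in the text
    """
    if candidate_tf <= 0:
        raise ValueError("candidate_tf must be positive")
    # The product rewards candidates whose terms all score low; the frequency
    # and the sum of term scores in the denominator scale the result.
    return prod(term_scores) / (candidate_tf * (1.0 + sum(term_scores)))
```

Feeding the same per-term scores and frequencies from the Python run into the port's equivalent function shows whether the difference comes from this step or from an earlier one, such as tokenization or candidate generation.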

bryan1anderson commented 2 years ago

Following along here. I'm also hoping to use YAKE in Swift. @mrigankgupta I'm happy to help dig into what you've got going.

bryan1anderson commented 2 years ago

@mrigankgupta Please let me know if you'd like help or a tester! I'm likely going to have to port it over to Swift too, and would love it if we could share code.