Donatello-za / rake-php-plus

A keyword and phrase extraction library based on the Rapid Automatic Keyword Extraction algorithm (RAKE).
MIT License
260 stars, 48 forks

Is it possible to do keywords extraction from links #10

Closed thekubilay closed 5 years ago

thekubilay commented 5 years ago

I want to find the keywords which have been used for a website search, and identify the most commonly used ones. Is it possible to do text mining on website links with this API?

Donatello-za commented 5 years ago

That is beyond the scope of this library. You would have to use another library/solution to perform the web scraping and extract all the text from the page/website. Once you've done that you can use this package to extract all the keywords from the text. Let me know if this answers your question.
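As a rough illustration of the "extract all the text" step, assuming the page's HTML has already been fetched, a plain-PHP sketch could look like this. The `htmlToPlainText` helper is hypothetical and not part of this library; it only uses built-in PHP string functions, and a dedicated HTML parser may be a better fit for messy real-world pages.

```php
<?php
/**
 * Hypothetical helper (not part of rake-php-plus): reduce an HTML
 * document to plain text suitable for keyword extraction.
 */
function htmlToPlainText(string $html): string
{
    // Remove <script> and <style> blocks together with their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Put a space before every tag so that adjacent elements
    // (e.g. </h1><p>) do not run together once the tags are stripped.
    $html = str_replace('<', ' <', $html);

    // Strip the remaining tags and decode entities such as &amp;.
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$html = '<html><head><style>p { color: red; }</style></head>'
      . '<body><h1>Ford Bronco</h1><p>Sputtering after starting.</p></body></html>';

echo htmlToPlainText($html), PHP_EOL;
```

The resulting string is what you would then pass to this package. Note that inserting a space before every tag also splits words that contain inline markup, which is usually acceptable for keyword extraction.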

thekubilay commented 5 years ago

Thank you for getting back to me. Actually, web scraping is no problem; I can scrape a page. But I don't know how to find the keywords that people use to reach that website, or which of those keywords are used most. Is it possible? Google won't give me those keywords, but can I find them myself, with this library or another one?

My purpose here: if I find the keywords people use most to reach that website, I can use those keywords in my own website's meta...

Donatello-za commented 5 years ago

The library allows you to take some text, extract useful words and phrases from it, and discard all common words such as "and", "or", "are", "does", etc. So let's say a person searched Google for: "Why is my Ford Bronco sputtering after starting it in the mornings?"

You can then get useful phrases from it like so:

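<?php
// Load Composer's autoloader so the RakePlus class can be found.
require 'vendor/autoload.php';

use DonatelloZa\RakePlus\RakePlus;

$text = "Why is my Ford Bronco sputtering after starting it in the mornings?";

// Extract the key phrases from the text.
$phrases = RakePlus::create($text)->get();

print_r($phrases);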

This will give you:

Array
(
    [0] => ford bronco sputtering
    [1] => starting
    [2] => mornings
)

You can of course extract the individual keywords instead of phrases as well:

$keywords = RakePlus::create($text)->keywords();

Which will give you:

Array
(
    [0] => ford
    [1] => bronco
    [2] => sputtering
    [3] => starting
    [4] => mornings
)

Having extracted these keywords, you can store them in a database and keep counters on them. I have used it in the past to extract keywords from emails, and to build search indexes and "tags" from forms that were filled in by users on my websites, etc.

So to summarise, the only thing this library does is extract useful keywords and phrases from a piece of text and discard words/phrases that are not useful.
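To make the "keep counters" idea concrete, here is a minimal in-memory sketch of tallying keywords across several search queries. The `countKeywords` function is a hypothetical helper, not part of this library, and the keyword lists are hard-coded so the example runs without any dependencies; in practice each list would presumably come from a `keywords()` call like the one above, and the counts would live in a database table.

```php
<?php
/**
 * Hypothetical helper (not part of rake-php-plus): tally how often
 * each keyword occurs across many keyword lists.
 *
 * @param  string[][] $keywordLists One list of keywords per search query.
 * @return array Keyword => count, sorted from most to least frequent.
 */
function countKeywords(array $keywordLists): array
{
    $counts = [];

    foreach ($keywordLists as $keywords) {
        foreach ($keywords as $keyword) {
            // Start at zero the first time a keyword is seen.
            $counts[$keyword] = ($counts[$keyword] ?? 0) + 1;
        }
    }

    // Sort by count, highest first, keeping the keyword keys.
    arsort($counts);

    return $counts;
}

// Assumed sample data: one keyword list per search query.
$keywordLists = [
    ['ford', 'bronco', 'sputtering', 'starting', 'mornings'],
    ['ford', 'bronco', 'stalling'],
    ['ford', 'ranger', 'sputtering'],
];

$counts = countKeywords($keywordLists);

// Show the three most common keywords.
print_r(array_slice($counts, 0, 3, true));
```

Sorting the whole array is fine for small data sets; with a database you would more likely increment a counter column per keyword and query the top rows.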

thekubilay commented 5 years ago

I am really thankful for the information that you gave me. This library is what I want.

require 'vendor/autoload.php';

use DonatelloZa\RakePlus\RakePlus;

$text = "Why is my Ford Bronco sputtering after starting it in the mornings?";

$phrases = RakePlus::create($text)->get();

print_r($phrases);

Does this code go in a controller or in a Blade template? I think I don't need a controller to use it and can just call it from Blade? Sorry for the trouble, but I couldn't understand the explanation.

Donatello-za commented 5 years ago

My example was meant to be run from the command line. But yes, you can use it in your Laravel controller. So after installing it through Composer, you can do something like this:

<?php
namespace App\Http\Controllers;

use Illuminate\Http\Request;
use DonatelloZa\RakePlus\RakePlus;

class ArticleController extends Controller
{
    /**
     * Store a new article post.
     *
     * @param  Request  $request
     *
     * @return void
     */
    public function store(Request $request)
    {
        $text = $request->input('article_text');
        $keywords = RakePlus::create($text)->keywords();

        // Store the array of keywords to a database table...
        // ....
    }
}

thekubilay commented 5 years ago

Okay, thank you! This helps a lot :)