Open chegmarco1989 opened 2 years ago
Hi @chegmarco1989
The data source is a nested array of integers, or a graph if you like. These integers are the IDs of the entities that are being page ranked.
Btw, PageRank is not only for ranking web pages. It can rank any entities as long as the relationships between them are known. It is called "PageRank" because the inventor's name is Larry Page.
The functional tests show the usage: https://github.com/PHP-Science/PageRank/blob/master/tests/functional/Service/PageRankAlgorithmTest.php#L86
The array contains a list of entities with their IDs. And it also contains the incoming and outgoing connections.
The method createPageRankAlgorithm shows how to build up the object and where to put the data source, and the method testRun shows the usage.
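To make the "IDs plus incoming and outgoing connections" idea concrete for web pages, here is a minimal sketch of turning a list of links between URLs into integer IDs and a graph. It is written in Python for brevity and is not the package's actual data source format: the function name `build_graph`, the `"in"`/`"out"` keys, and the example URLs are all assumptions made for illustration. The linked functional test remains the reference for the real array layout.

```python
def build_graph(links):
    """Map URLs to integer IDs and collect each node's connections.

    `links` is an iterable of (source_url, target_url) pairs.
    The returned layout is hypothetical, not the package's own format.
    """
    ids = {}

    def id_of(url):
        # Assign the next free integer ID the first time a URL is seen.
        if url not in ids:
            ids[url] = len(ids) + 1
        return ids[url]

    graph = {}
    for source_url, target_url in links:
        source, target = id_of(source_url), id_of(target_url)
        graph.setdefault(source, {"in": [], "out": []})
        graph.setdefault(target, {"in": [], "out": []})
        graph[source]["out"].append(target)  # outgoing connection
        graph[target]["in"].append(source)   # incoming connection
    return ids, graph


# Example links between three made-up pages.
links = [
    ("a.example/", "b.example/"),
    ("a.example/", "c.example/"),
    ("b.example/", "c.example/"),
    ("c.example/", "a.example/"),
]
url_ids, graph = build_graph(links)
```

So a plain list of URLs is not enough on its own: the algorithm needs to know which page links to which, and the URLs only serve as labels for the integer IDs.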
Too many entities or too high an iteration count will make the algorithm take longer to run. I believe that, in the real world, a PageRank algorithm runs in parallel on smaller topics, sometimes for weeks. (Also, the optimised search algorithms weren't as efficient as the PHP built-in search algorithms.)
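To illustrate why both the number of entities and the iteration count drive the run time, here is a rough sketch of the textbook power-iteration form of PageRank in Python. It is not this package's implementation; the function name `pagerank`, its parameters, and the default damping factor of 0.85 are assumptions for the example. Each iteration visits every node and every link once, so the total work grows roughly with iterations × (nodes + links).

```python
def pagerank(out_links, damping=0.85, max_iterations=100, tolerance=1e-9):
    """Textbook power iteration; `out_links` maps node ID -> list of target IDs."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}

    for _ in range(max_iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(ranks[node] for node in nodes if not out_links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {node: base for node in nodes}

        # One pass over every link: each node shares its rank among its targets.
        for node, targets in out_links.items():
            if targets:
                share = damping * ranks[node] / len(targets)
                for target in targets:
                    new_ranks[target] += share

        change = sum(abs(new_ranks[node] - ranks[node]) for node in nodes)
        ranks = new_ranks
        if change < tolerance:
            break  # converged early, fewer iterations needed
    return ranks


# Three nodes: 1 links to 2 and 3, 2 links to 3, 3 links back to 1.
ranks = pagerank({1: [2, 3], 2: [3], 3: [1]})
```

In this small graph, node 3 ends up with the highest rank because it receives links from both other nodes. Lowering `max_iterations` or loosening `tolerance` trades accuracy for speed.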
Thank you very much @DavidBelicza for your answer.
But what do you mean by "Also the optimised search algorithms weren't as efficient as the PHP builtin search algorithms"?
Can you give us some examples of native PHP algorithms that deliver results more efficiently than PageRank?
Thank you in advance for your reply.
Hi.
We want to know:
1 - How do we use it to rank web pages? What should we put in the $datasource variable to rank our web pages? Should we just pass the list of URLs as the data source?
2 - Can this package handle a large or very large database, on the order of millions or even billions of records?
Thank you in advance for the information.