Open traverseda opened 4 years ago
I would love to see this as well.
There is actually an existing attempt to compute PageRank in YaCy. The process is somewhat hidden and disabled by default.
Postprocessing runs only if the field process_sxt is activated, and that field is deactivated by default. Postprocessing can therefore be enabled by activating that field (this can be done in the front-end).
The implementation should be considered experimental only. In practice, the computation increased IO and CPU activity on the user's side so much that a normal user would no longer accept the application. It was therefore deactivated.
I would consider running a new implementation of PageRank as a process for YaCy Grid. "Legacy" YaCy would not be the right place for such intensive computations. It's too bad, but user acceptance also has to be considered.
You are free to activate the feature for experiments. I would love to get your input here.
It looks like YaCy stores enough metadata to implement PageRank, although I'm not sure how you could implement it in a distributed system. It seems like it might improve the relevance of results. I'm honestly not sure how the ranking works internally now.
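For reference, here is a minimal single-machine sketch of the textbook PageRank power iteration, assuming the link graph is already available as a plain dictionary. This is not YaCy's implementation and says nothing about how the distributed case would work; the function name `pagerank`, the `links` structure, and the default parameters are all hypothetical choices made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative sketch only).

    links: dict mapping a page to the list of pages it links to
           (hypothetical input format, not a YaCy data structure).
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n
            for page in pages
        }
        # Every page passes an equal share of its rank to each link target.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank

    return rank


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(example).items()):
        print(page, round(score, 3))
```

Each iteration touches every link once, which gives a rough sense of why running this over a large index is IO- and CPU-heavy.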