controversial / wikipedia-map

A web app for visualizing the connections between Wikipedia pages.
https://wikipedia.luk.ke
MIT License
446 stars, 78 forks

Get more links for selected nodes #41

Open naught101 opened 4 years ago

naught101 commented 4 years ago

The current script gets links from the first paragraph only, but this is sometimes not particularly useful. For example, Dog only returns "Carl Linnaeus" (this might be a bug, though, because the first paragraph of https://en.wikipedia.org/wiki/Dog has more links than that).

It would be good to be able to (optionally) pull links from more paragraphs, so that nodes with weak first paragraphs can be expanded.

Also, I wonder whether it would be better to use the first three paragraphs by default. I have a local copy that gets the first three, and it seems to capture a much more representative set of links.
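
To make the "first N paragraphs" idea concrete, here is a minimal sketch of what such an extraction could look like. It is not the project's actual code: the names `ParagraphLinkParser`, `first_paragraph_links`, and `max_paragraphs` are hypothetical, and it assumes the article HTML has already been fetched and that article links look like `/wiki/Title`.

```python
from html.parser import HTMLParser


class ParagraphLinkParser(HTMLParser):
    """Collect article link targets from the first `max_paragraphs` <p> tags.

    Hypothetical helper for illustration; not part of wikipedia-map.
    """

    def __init__(self, max_paragraphs=3):
        super().__init__()
        self.max_paragraphs = max_paragraphs
        self.paragraphs_seen = 0
        self.in_paragraph = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Count every <p>; only the first few are eligible for links.
            self.paragraphs_seen += 1
            self.in_paragraph = self.paragraphs_seen <= self.max_paragraphs
        elif tag == "a" and self.in_paragraph:
            href = dict(attrs).get("href") or ""
            # Assumption: article links start with /wiki/ and namespaced
            # pages (File:, Help:, ...) contain a colon, so skip those.
            if href.startswith("/wiki/") and ":" not in href:
                title = href[len("/wiki/"):].split("#")[0]
                if title and title not in self.links:
                    self.links.append(title)

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False


def first_paragraph_links(html, max_paragraphs=3):
    """Return de-duplicated link titles from the first few paragraphs."""
    parser = ParagraphLinkParser(max_paragraphs)
    parser.feed(html)
    return parser.links
```

Calling `first_paragraph_links(html, 1)` would mimic the current first-paragraph behaviour, while the default of 3 matches the suggestion above. One caveat with this sketch: empty placeholder paragraphs in the markup would still count toward the limit, so a real version would probably want to skip them.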

naught101 commented 4 years ago

Even better, obviously, would be a way of ranking links by their importance, but I'm not sure that Wikipedia's data structure allows that kind of analysis.
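
As a rough illustration of what "ranking" could mean here, one crude proxy is how often each link target occurs across the whole article. This is only an assumed heuristic, not something Wikipedia exposes as an importance score, and `rank_links` and `top_n` are hypothetical names:

```python
from collections import Counter


def rank_links(link_occurrences, top_n=10):
    """Order link titles by how often they occur, most frequent first.

    `link_occurrences` is a list of titles with repeats kept, e.g. every
    link found while scanning the full article. Frequency is only a
    stand-in for importance.
    """
    counts = Counter(link_occurrences)
    # most_common() keeps first-seen order among equal counts.
    return [title for title, _ in counts.most_common(top_n)]
```

Since articles often link a term only on its first mention, raw frequency may be a weak signal in practice; it is shown here only to make the idea tangible.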

controversial commented 4 years ago

The issue with “Dog” looks like it’s a bug; I’ll take a look.

I’m not sure that there’s an easy way to add a “more links” functionality without compromising usability. The current system is good in part because in use, the actual structure of the page being parsed is opaque to the user, and one can use the app as a knowledge interface without giving any regard to the denser form of Wikipedia if one chooses. Adding more features that explicitly reference the structure of the Wikipedia page would remove this separation somewhat.
