arifr1234 / wikipedia-graph

Graph/network interface to Wikipedia.
https://arifr1234.github.io/wikipedia-graph/

Links that redirect to a different title. #1

Open arifr1234 opened 3 years ago

arifr1234 commented 3 years ago

For example, if you open the Wikipedia Graph page for "Sunflowers (Van Gogh series)" and click the link "Oil on canvas", you are shown the content of "Oil painting", but the title remains "Oil on canvas". If you then open "Oil painting", you can see that the two have the same content but different titles. The reason is that the URL of the initial link, "Oil on canvas", was https://en.wikipedia.org/wiki/Oil_on_canvas, which normally redirects immediately to https://en.wikipedia.org/wiki/Oil_painting. The REST API followed the redirect and returned the HTML content of "Oil painting", but when the titles are extracted from the links (to build the graph, for instance), the pre-redirect title is used. So as far as the graph is concerned, "Oil on canvas" and "Oil painting" are different articles, even though "Oil on canvas" redirects to "Oil painting".
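A minimal Python sketch of the behaviour described above, not the project's actual code. `RESOLVED` and `build_nodes_by_title` are hypothetical names; the table stands in for what the fetch returns after following redirects.

```python
# Hypothetical stand-in for the fetch step: maps the title used in a link
# to the title of the article whose content is actually returned.
RESOLVED = {
    "Oil on canvas": "Oil painting",  # redirect
    "Oil painting": "Oil painting",   # direct hit
}

def build_nodes_by_title(link_titles):
    """Key graph nodes by the pre-redirect link title (the problematic approach)."""
    nodes = {}
    for title in link_titles:
        nodes[title] = {"title": title, "content_from": RESOLVED[title]}
    return nodes

nodes = build_nodes_by_title(["Oil on canvas", "Oil painting"])
# Two separate nodes exist even though both hold the content of one article.
duplicates = len(nodes) - len({n["content_from"] for n in nodes.values()})
```

Here `duplicates` counts the extra nodes created for an article that is already in the graph under another title.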

arifr1234 commented 2 years ago

This can be solved by using page IDs (pageids) as the dictionary keys instead of the page titles.
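One way this could look, sketched in Python under some assumptions: it uses the MediaWiki Action API (`action=query` with `redirects=1`) rather than whatever endpoint the project currently calls, and the helper names (`query_url`, `canonical_page`, `add_node`) are hypothetical. The sample responses are hand-written to match the shape of the API's JSON output, and the page ID `1001` is a placeholder, not the real ID of "Oil painting".

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def query_url(title):
    """Build an Action API URL that resolves redirects for one title."""
    params = {"action": "query", "titles": title, "redirects": 1, "format": "json"}
    return API + "?" + urlencode(params)

def canonical_page(response):
    """Return (pageid, title) of the first page in an action=query response."""
    page = next(iter(response["query"]["pages"].values()))
    return page["pageid"], page["title"]

def add_node(nodes, response):
    """Insert a node keyed by pageid; an existing node is left untouched."""
    pageid, title = canonical_page(response)
    nodes.setdefault(pageid, {"title": title, "links": set()})
    return pageid

# Illustrative responses (placeholder pageid 1001).
VIA_REDIRECT = {"query": {
    "redirects": [{"from": "Oil on canvas", "to": "Oil painting"}],
    "pages": {"1001": {"pageid": 1001, "ns": 0, "title": "Oil painting"}},
}}
DIRECT = {"query": {
    "pages": {"1001": {"pageid": 1001, "ns": 0, "title": "Oil painting"}},
}}

nodes = {}
first = add_node(nodes, VIA_REDIRECT)   # reached through "Oil on canvas"
second = add_node(nodes, DIRECT)        # reached through "Oil painting"
```

Because the resolved response carries the target's page ID, both lookups land on the same key and the graph ends up with a single node. A real implementation would also need to handle missing pages, which this sketch ignores.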