medialab / hyphe

Websites crawler with built-in exploration and control web interface
http://hyphe.medialab.sciences-po.fr/demo/
GNU Affero General Public License v3.0
328 stars, 59 forks

Edge identification #415

Closed by ralphverwegen 3 years ago

ralphverwegen commented 3 years ago

I'm interested in identifying the specific hyperlink(s) that constitute certain edges. In other words, I want to know exactly which parts of a website were linked to. Is this possible with Hyphe?

Thank you for your time

boogheta commented 3 years ago

Hello @ralphverwegen, Hyphe stores this information in detail, but it is not easily retrievable as such yet. The frontend web interface does not expose it, but the backend API gives access to the detailed links between pages for each webentity through the route store.paginate_webentity_pagelinks_network, as documented here: https://github.com/medialab/hyphe/blob/master/doc/api.md#pages-links-and-networks
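As a rough illustration, a call to that route from Python could look like the sketch below. Only the method name comes from the documentation linked above. The endpoint URL, the request body layout (a JSON-RPC style object with `method`, `params` and `id`), and the helper names `build_request` and `call_api` are assumptions made for this example; the exact parameters and their order should be taken from doc/api.md.

```python
import json
import urllib.request

# Assumption: address of the Hyphe API on your own install; adjust as needed.
API_URL = "http://localhost/api/"

METHOD = "store.paginate_webentity_pagelinks_network"


def build_request(*params, request_id=1):
    """Build a JSON-RPC style request body for the page links route.

    The positional parameters (webentity id, corpus, etc.) are passed
    through untouched: check doc/api.md for the expected list and order.
    """
    return {"method": METHOD, "params": list(params), "id": request_id}


def call_api(payload, url=API_URL):
    """POST the payload as JSON and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage, with 12 standing in for a real webentity id:
# response = call_api(build_request(12))
# print(json.dumps(response, indent=2))
```

Since the route name suggests paginated results, a full export would probably need repeated calls until no further page is returned. That loop is left out here because the shape of the response is not described in this thread.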

ralphverwegen commented 3 years ago

Thank you @boogheta

I appreciate your time.