OSP123 / myspace-lookup-tool


Research Way Back Machine APIs #12

Closed OSP123 closed 5 months ago

OSP123 commented 5 months ago

User Story: As a developer, I would like to understand the Wayback Machine APIs and similar technology so that we can determine whether we can pass data to the API and display the actual archived page in our application.

Acceptance Criteria:

OSP123 commented 5 months ago

According to an article from 2023, we can retrieve a web page's cached contents through the Wayback Machine API: https://importsem.com/get-cached-pages-via-python-and-wayback-machine-api/#:~:text=To%20retrieve%20cached%20pages%20programmatically,content%20for%20the%20specified%20date.

We are able to retrieve the URLs that we need. However, the API points us to the URL of the archived snapshot and doesn't return the HTML itself. To retrieve just the content, we might have to request that snapshot URL and scrape the resulting web page.
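
A minimal sketch of that two-step flow in Python, using only the standard library. It assumes the public availability endpoint at `https://archive.org/wayback/available`, which returns JSON describing the closest snapshot. The function names (`build_availability_url`, `closest_snapshot_url`, `fetch_archived_html`) and the example profile URL are hypothetical and only for illustration; this has not been tried against our real data.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url: str, timestamp: str = "") -> str:
    """Build the lookup URL; timestamp is optional, formatted as YYYYMMDD[hhmmss]."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot_url(payload: dict) -> Optional[str]:
    """Pull the closest snapshot URL out of the JSON response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")


def fetch_archived_html(page_url: str, timestamp: str = "",
                        timeout: float = 30.0) -> Optional[str]:
    """Step 1: ask the API where the snapshot lives. Step 2: download that page."""
    with urlopen(build_availability_url(page_url, timestamp), timeout=timeout) as resp:
        payload = json.load(resp)

    snapshot_url = closest_snapshot_url(payload)
    if snapshot_url is None:
        return None  # no archived copy found for this URL

    # The API only gave us a link, so the HTML has to be requested separately.
    with urlopen(snapshot_url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch_archived_html("myspace.com/tom", "20080101")` would return the snapshot's HTML as a string, or `None` when no snapshot exists. The returned HTML would still need to be parsed or sanitized before we display it in our application.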