Unlike #7, some departmental web page return 0 mementos, i.e., an empty TimeMap. These web pages or a precursor may have existed at a different URL; for example, from english.university.edu to university.edu/depts/english. Digging into the data will help to reveal these trends. This process will require consulting the historical captures of the universities' home pages, scraping or parsing out the departmental URLs, and identifying where the web page for the departmental web page once resided -- if at all.
We may encounter an issue of new departments at the university that did not exist in the past.
I anticipate this process to go as follows:
Identify a small set of departmental web pages that have no historical captures.
Using an initially manual process, look at the older mementos for the university's homepage to identify an instance of departments existing elsewhere.
Use a tool like BeautifulSoup to scrape/parse out the historical URL (URI-M) of the department and respective faculty pages.
Unlike #7, some departmental web page return 0 mementos, i.e., an empty TimeMap. These web pages or a precursor may have existed at a different URL; for example, from
english.university.edu
touniversity.edu/depts/english
. Digging into the data will help to reveal these trends. This process will require consulting the historical captures of the universities' home pages, scraping or parsing out the departmental URLs, and identifying where the web page for the departmental web page once resided -- if at all.We may encounter an issue of new departments at the university that did not exist in the past.
I anticipate this process to go as follows: