TheStanfordDaily / archives-web

Helper functions and web app for METS/ALTO archive viewing.
https://archives.stanforddaily.com

some articles do not match up with text #88

Closed by ufxela 4 years ago

ufxela commented 4 years ago

e.g. https://d148m2cwwi25lc.cloudfront.net/1894/01/08?page=1&section=MODSMD_ARTICLE10#article

Not sure how often this occurs, but I couldn't find another example in the 5 minutes I spent looking.

ufxela commented 4 years ago

#60 might be helpful in determining how big an issue this is.

ufxela commented 4 years ago

OK, this is actually a pretty big problem: articles from day N are being linked back to the article texts from day N-1.

epicfaace commented 4 years ago

Also, the page you linked tries to fetch its text from https://raw.githubusercontent.com/TheStanfordDaily/archives-text/master/1894/01/07/MODSMD_ARTICLE10.article.txt, but that file doesn't exist. In fact, the archives-text repo has no day for Jan 7, 1894 at all: see https://github.com/TheStanfordDaily/archives-text/tree/master/1894/01
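
To make the off-by-one concrete, here is a minimal sketch in Python (not the app's actual code) that derives the text URL one would expect from a viewer URL and compares it with the day N-1 URL that was requested. The helper names `text_url` and `expected_text_url` are hypothetical, and the URL layout is only inferred from the two links quoted in this thread; the thread does not say where the date shift comes from, so the sketch only demonstrates the mismatch.

```python
from datetime import date, timedelta
from urllib.parse import urlparse, parse_qs

# Assumption: text files live at <base>/YYYY/MM/DD/<section>.article.txt,
# as suggested by the raw.githubusercontent.com link quoted above.
TEXT_BASE = "https://raw.githubusercontent.com/TheStanfordDaily/archives-text/master"


def text_url(issue_date: date, section: str) -> str:
    """Build the article-text URL for a given issue date and section id."""
    return (
        f"{TEXT_BASE}/{issue_date.year:04d}/{issue_date.month:02d}/"
        f"{issue_date.day:02d}/{section}.article.txt"
    )


def expected_text_url(viewer_url: str) -> str:
    """Derive the text URL for the same day the viewer URL points at."""
    parsed = urlparse(viewer_url)
    # Viewer path is assumed to look like /YYYY/MM/DD.
    year, month, day = (int(part) for part in parsed.path.strip("/").split("/"))
    section = parse_qs(parsed.query)["section"][0]
    return text_url(date(year, month, day), section)


viewer = (
    "https://d148m2cwwi25lc.cloudfront.net/1894/01/08"
    "?page=1&section=MODSMD_ARTICLE10#article"
)
expected = expected_text_url(viewer)
# What the page reportedly requested: the same section, one day earlier.
requested = text_url(date(1894, 1, 8) - timedelta(days=1), "MODSMD_ARTICLE10")

print("expected: ", expected)
print("requested:", requested)
print("match:    ", expected == requested)
```

Running a check like this over every issue date would also give a rough count of how many pages are affected.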