edgi-govdata-archiving / web-monitoring

Documentation and project-wide issues for the Website Monitoring project (a.k.a. "Scanner")
Creative Commons Attribution Share Alike 4.0 International

Not receiving Scanner output emails #152

gretchengehrke closed 4 years ago

gretchengehrke commented 4 years ago

Hi @Mr0grog and @lightandluck, I haven't received the weekly Scanner outputs in a few weeks now. Rob has uploaded the last couple of months' worth of Scanner outputs, but I had still been receiving them until maybe a month ago. Has there been an issue with IAWM indexing or something? Thanks! Gretchen

Mr0grog commented 4 years ago

Yes, everything is broken at the moment: https://archivers.slack.com/archives/CDBAZKJJU/p1588035429003800

Mr0grog commented 4 years ago

As a general update on this: the old-style e-mails broke because we don’t have enough memory on the machine that processes them. This is partially caused by increasing amounts of small changes (so there’s more and more data to crunch) and partially caused by the fact that they run on the same machine that imports data from Wayback, a process that has gotten longer, slower, and more memory intensive as we work around Wayback issues.

In any case, we are no longer really using those old sheets, so I’ve simply gone ahead and turned off that functionality.