ukwa / w3act

w3act is an annotation and curation tool for building web archive collections
Apache License 2.0

Limiting the number of secondary seeds #685

Open crarugal opened 2 years ago

crarugal commented 2 years ago

This ACT record cannot be accessed: the page returns 504 Gateway Time-out because the target has so many secondary seeds. https://www.webarchive.org.uk/act/targets/138705

It has 480 secondary seeds; perhaps we should limit the number allowed per target.
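
As a rough illustration of what such a limit might look like, here is a minimal sketch of a cap enforced when a seed is added. It is not w3act's actual code: `Target`, `add_secondary_seed`, `TooManySeedsError`, and the value of `MAX_SECONDARY_SEEDS` are all hypothetical names and numbers chosen for the example, and no specific limit has been agreed.

```python
from dataclasses import dataclass, field

# Hypothetical cap; no specific number has been proposed in this issue.
MAX_SECONDARY_SEEDS = 100


class TooManySeedsError(ValueError):
    """Raised when adding a seed would exceed the secondary-seed cap."""


@dataclass
class Target:
    # Simplified stand-in for an ACT target record, not the real w3act model.
    title: str
    primary_seed: str
    secondary_seeds: list[str] = field(default_factory=list)

    def add_secondary_seed(self, url: str, limit: int = MAX_SECONDARY_SEEDS) -> None:
        # Reject the new seed once the target already holds `limit` seeds,
        # so the record never grows past the cap.
        if len(self.secondary_seeds) >= limit:
            raise TooManySeedsError(
                f"target already has {len(self.secondary_seeds)} secondary seeds "
                f"(limit {limit})"
            )
        self.secondary_seeds.append(url)
```

With a check like this, a curator would get an immediate error when a target reaches the cap, and could split the remaining seeds into a second target instead of ending up with a record that times out.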



| id | url | created_at | title | author_id | professional_judgement | depth | ignore_robots_txt | crawl_frequency | crawl_start_date | crawl_end_date | license_status | updated_at |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 138705 | act-8873953430350088168 | 30 April 2021 | Candidates standing for Constituencies on Facebook #1 | 36 | TRUE | CAPPED | TRUE | DAILY | 01 May 2021 | 15 May 2021 | NOT_INITIATED | 30 April 2021 |