Rossil2012 closed this issue 4 years ago
{
  "expected_update_period_in_days": "40",
  "url": [
    "http://electsys.sjtu.edu.cn/edu/"
  ],
  "type": "text",
  "mode": "on_change",
  "extract": {
    "link": {
      "regexp": "<a href\\=\\'(.+?htm)\\'",
      "index": "1"
    },
    "date": {
      "regexp": "date\\\"\\>(\\(.+?\\))\\<\\/font>",
      "index": "1"
    },
    "title": {
      "regexp": "title\\=\\'(.+?)\\'class\\=\\\"news\\\">",
      "index": "1"
    }
  },
  "template": {
    "title": "{{title}} {{date}}"
  }
}
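For anyone who wants to sanity-check the three extraction regexes outside Huginn, here is a rough Python sketch. It decodes the JSON-escaped patterns and runs them against a made-up HTML fragment. The `SAMPLE_HTML` string and the `extract_first` helper are my own assumptions for illustration, not the real page markup or Huginn's implementation, and the sketch only looks at the first match of each pattern.

```python
import json
import re

# The "extract" section of the config, kept as raw JSON so the escaping
# is decoded exactly the way a JSON parser would decode it.
EXTRACT_JSON = r'''
{
  "link":  {"regexp": "<a href\\=\\'(.+?htm)\\'", "index": "1"},
  "date":  {"regexp": "date\\\"\\>(\\(.+?\\))\\<\\/font>", "index": "1"},
  "title": {"regexp": "title\\=\\'(.+?)\\'class\\=\\\"news\\\">", "index": "1"}
}
'''

# Hypothetical markup, invented for this example only.
SAMPLE_HTML = (
    "<a href='news/20190101.htm' title='Exam schedule'class=\"news\">"
    "Exam schedule</a><font class=\"date\">(2019-01-01)</font>"
)


def extract_first(html, spec):
    """Return the chosen capture group of the first match for each field."""
    result = {}
    for name, rule in spec.items():
        match = re.search(rule["regexp"], html)
        # "index" selects the capture group; None means the pattern did not match.
        result[name] = match.group(int(rule["index"])) if match else None
    return result


if __name__ == "__main__":
    print(extract_first(SAMPLE_HTML, json.loads(EXTRACT_JSON)))
```

If a field comes back as `None` on real page source, that pattern is the one to look at; otherwise the extraction side is probably fine.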
Error when fetching url: Failure when receiving data from the peer
This means there is a connection problem between your Huginn server and the website you are trying to fetch.
The working status is only an indicator for the user. Huginn itself will still schedule the Agent as it normally does, based on the Agent configuration. We don't have a function that immediately retries the Agent after a failure.
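Since Huginn has no built-in immediate retry, one option is to do the retrying outside Huginn, for example in a small script that fetches the page itself. The following is only a sketch of a retry loop with exponential backoff in Python. The `fetch_with_retry` name, its parameters, and the default values are hypothetical choices of mine and not part of Huginn.

```python
import time
import urllib.request


def _default_fetch(url):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def fetch_with_retry(url, attempts=3, delay=1.0, fetch=None, sleep=time.sleep):
    """Call fetch(url), retrying on connection errors.

    Waits delay, 2*delay, 4*delay, ... seconds between attempts and
    re-raises the last error once all attempts are used up.
    """
    fetch = fetch or _default_fetch
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            last_error = exc
            if attempt < attempts:
                sleep(delay * 2 ** (attempt - 1))
    raise last_error


if __name__ == "__main__":
    body = fetch_with_retry("http://electsys.sjtu.edu.cn/edu/")
    print(len(body), "bytes received")
```

The `fetch` and `sleep` arguments are injectable so the loop can be exercised without touching the network. Inside Huginn itself, the Agent will simply run again at its next scheduled time.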
Thank you, but I think the error is caused by the Internet connection, not by the extraction regexes. Dsander has answered my question.
Thank you, I got it.
When using the Website Agent to scrape data from "http://electsys.sjtu.edu.cn/edu/", the error below occurs occasionally and sets the working status to "No". The periodic check is not stopped, and new events will refresh the working status, but in the meantime my copy of the website's content falls out of date. So I am wondering whether I can restart the agent as soon as the error occurs.
My code is here:
And the log is here: