Open jkrss opened 5 years ago
Hi,
I would like to crawl and scrape the content of a whole website. This is the code:
Rcrawler(Website = URL, no_cores = 4, no_conn = 4, ExtractXpathPat = c("//./div[@class='bodytext']//p", "//./h1[@class='blogtitle']", "//./div[@id='kommentare']//p"), PatternsNames = c("article", "title", "comments"), ManyPerPattern = TRUE)
After retrieving approx. 19% of the data I get the following error message:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0
It always happens at the same point; DATA and INDEX are created correctly and contain every entry crawled up to the error message.
Am I doing something wrong or is it something with the website I would like to crawl? I am using Rcrawler version 0.1.9-1.
Thanks for helping me out!