Right now, sites sometimes return no content on the initial scraping pass. Need to try changing it so that the search is run again whenever the number of results for a search falls below a certain threshold.
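A minimal sketch of the re-run idea, not the project's actual code. The names `search_with_retry`, `run_search`, `min_results`, `max_attempts` and `delay_seconds` are all hypothetical, and the threshold values are placeholders that would need tuning.

```python
import time
from typing import Callable, List


def search_with_retry(
    run_search: Callable[[str], List[str]],  # hypothetical: returns scraped results for a query
    query: str,
    min_results: int = 5,       # assumed threshold, tune per search
    max_attempts: int = 3,      # assumed cap so a dead site cannot loop forever
    delay_seconds: float = 0.0  # optional pause between attempts
) -> List[str]:
    """Re-run the search until it yields at least min_results, keeping the best attempt."""
    best: List[str] = []
    for attempt in range(max_attempts):
        results = run_search(query)
        # Keep the largest result set seen so far, in case a later pass is worse.
        if len(results) > len(best):
            best = results
        if len(best) >= min_results:
            break
        # Pause only if another attempt will follow.
        if delay_seconds > 0 and attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return best
```

Capping the attempts matters here: if a site never returns content, the search should give up and return whatever it has rather than retry indefinitely.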
Might also have to look at adding something like Diffbot here as a fallback to make sure the content comes in, for the case where the search returns a lot of results but none of them yield valid documents.
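One way the fallback could be wired in, as a rough sketch only. `scrape` and `fallback_extract` are hypothetical callables (the latter standing in for a Diffbot-style extraction service, whose real API is not shown here), and `min_chars` is an assumed definition of what counts as a "valid document".

```python
from typing import Callable, Dict, Iterable, Optional


def collect_documents(
    urls: Iterable[str],
    scrape: Callable[[str], Optional[str]],            # hypothetical: the existing scraper
    fallback_extract: Callable[[str], Optional[str]],  # hypothetical: wrapper around an extraction service
    min_chars: int = 200,                              # assumed minimum length for a valid document
) -> Dict[str, str]:
    """Return url -> text, trying the fallback extractor when the scraper comes back empty."""

    def is_valid(text: Optional[str]) -> bool:
        return text is not None and len(text.strip()) >= min_chars

    documents: Dict[str, str] = {}
    for url in urls:
        text = scrape(url)
        if not is_valid(text):
            # The primary scrape gave nothing usable, so try the fallback service.
            text = fallback_extract(url)
        if is_valid(text):
            documents[url] = text
    return documents
```

Calling the fallback only for URLs where the scraper fails keeps usage of the external service low, which matters if it is rate-limited or billed per request.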