Closed eliangcs closed 9 years ago
There's another bug that may share the same root cause as this one. When running `pystock-crawler reports` for a long time, say crawling 5k+ symbols, many of the filings are skipped because the parser can't obtain the document type via XPath, even though I'm sure the document type is 10-Q or 10-K. This bug is reproducible only with a large list of input symbols, and it seems to affect the same set of filings: those that come later in the crawling order.
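One way to narrow this down is to check whether a skipped filing's response body is parseable XML with a document type in it, or something else entirely (for example an error page). The following is a minimal diagnostic sketch, not pystock-crawler's actual parser: the function name `find_document_type` is hypothetical, and it assumes the document type lives in an element whose local name is `DocumentType`, whatever its namespace prefix.

```python
import xml.etree.ElementTree as ET


def find_document_type(body):
    """Return the text of the first DocumentType element, or None.

    Hypothetical helper for debugging; assumes the filing is XML and that
    the element's local name is "DocumentType" (namespace is ignored).
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Body is not well-formed XML, e.g. an HTML error page.
        return None
    for elem in root.iter():
        # Namespaced tags look like "{namespace-uri}DocumentType".
        local_name = elem.tag.rsplit('}', 1)[-1]
        if local_name == 'DocumentType' and elem.text:
            return elem.text.strip()
    return None
```

If this returns `None` for a filing that is known to be a 10-Q or 10-K, saving the raw body for inspection should show whether the server sent the real filing or something else.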
After crawling EDGAR for hours with the `pystock-crawler reports` command, there is a good chance that many of these warning messages show up in the log:

This leaves those reports with many null values. Perhaps the crawler hits EDGAR too often, causing EDGAR to return bad content.
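If the bad content really is caused by hitting EDGAR too often, one possible mitigation is to re-fetch a filing after a growing pause whenever its content fails validation. The sketch below only illustrates that idea and is not how pystock-crawler works today: `fetch_with_backoff`, `fetch`, and `is_valid` are hypothetical names, and the caller supplies both the download function and the validity check.

```python
import time


def fetch_with_backoff(fetch, is_valid, max_attempts=4, base_delay=1.0,
                       sleep=time.sleep):
    """Call fetch() until is_valid(body) is true, backing off exponentially.

    Hypothetical helper: fetch and is_valid are provided by the caller.
    Returns the first valid body, or None if every attempt fails.
    """
    for attempt in range(max_attempts):
        body = fetch()
        if is_valid(body):
            return body
        # Wait 1x, 2x, 4x, ... base_delay before the next attempt,
        # but do not sleep after the final one.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

Here `is_valid` could be any check that the document type was found. The `sleep` parameter is injectable only so the behavior can be exercised without actually waiting.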