Closed jchristgit closed 7 years ago
I'm having a bunch of issues with this.
Apparently, Scrapy only runs the contracts in the docstring of parse
and ignores the others. Since the cppreference spider uses parse only
as an entry point that dispatches different types of pages to their respective parsers, checking parse alone isn't really effective.
I currently have the following (method bodies cut out for readability):
def parse(self, response):
    """
    @url http://en.cppreference.com/w/cpp/symbol_index
    @returns requests 1
    """
    ...

def parse_symbol_index(self, response):
    """
    @url http://en.cppreference.com/w/cpp/symbol_index
    @returns requests 700
    """
    ...

def parse_std_symbol(self, response):
    """
    @url http://en.cppreference.com/w/cpp/symbol_index
    @returns items 740
    @scrapes names defined_in_header sigs desc return params example link
    """
    ...
All that Scrapy prints when running scrapy check cppreference -v is:
[cppreference] parse (@returns post-hook) ... ok
----------------------------------------------------------------------
Ran 1 contract in 4.883s
OK
... which is not the intended behavior: only the contract on parse ran.
Closing this, since spider contracts do not work correctly for this spider. Instead, after merging the branch, we will add a test suite covering the functions that were built for the scrapers as well as the various utility functions.
Spider Contracts enable us to easily validate different parsing functions from our Spiders. Adding these would help ensure that the parsers function correctly and reduce testing needs from our end.
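
Contracts are declared as @-prefixed lines in a callback's docstring. As a debugging aid, the sketch below lists which methods of a class declare such lines, to compare against what scrapy check reports. It is a simplified, standard-library-only illustration, not Scrapy's own discovery code; list_contracts, CONTRACT_LINE and CppReferenceSketch are hypothetical names introduced here, and the stand-in class only copies two docstrings from this issue.

```python
import inspect
import re

# Matches docstring lines such as "@returns requests 1".
# Hypothetical pattern for illustration; not taken from Scrapy.
CONTRACT_LINE = re.compile(r"^[ \t]*@(\w+)[ \t]*(.*)$", re.MULTILINE)


def list_contracts(cls):
    """Map each method name to the (name, args) contract lines in its docstring."""
    found = {}
    for name, member in inspect.getmembers(cls, callable):
        doc = getattr(member, "__doc__", None)
        # Skip dunder members and anything without a docstring.
        if not doc or name.startswith("__"):
            continue
        contracts = [
            (match.group(1), match.group(2).strip())
            for match in CONTRACT_LINE.finditer(doc)
        ]
        if contracts:
            found[name] = contracts
    return found


class CppReferenceSketch:
    """Stand-in for the spider; method bodies are omitted as in the issue."""

    def parse(self, response):
        """
        @url http://en.cppreference.com/w/cpp/symbol_index
        @returns requests 1
        """

    def parse_symbol_index(self, response):
        """
        @url http://en.cppreference.com/w/cpp/symbol_index
        @returns requests 700
        """


if __name__ == "__main__":
    for method, contracts in list_contracts(CppReferenceSketch).items():
        print(method, contracts)
```

If a method shows up here but not in the scrapy check output, the contract was declared but never ran, which narrows the problem down to how the check is executed rather than how the docstring is written.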