everypolitician / scraper_test

Data-driven scraper tests for Scraped

Handle :path field in instructions #25

Closed by ondenman 7 years ago

ondenman commented 7 years ago

Part of: https://github.com/everypolitician/scraper_test/issues/13

A future PR (WIP: https://github.com/everypolitician/scraper_test/pull/26) will allow the testing of a subset of the response data.

In anticipation of that PR, this PR provides a way to specify that subset of data. The Instructions class now handles a :path field: a list that can be used to target the required test data.

Although Instructions does not care about the contents of the :path field, it is intended to work like this:

The response data might contain multiple members within an array:

{ members: [{id: 1, name: 'Jim' }, { id: 2, name: 'Joe' }] }

The target is specified as a list in the :path field. To target Joe, the list would be:

:path:
    - :members
    - :id: 2

That is, the path specifies that the test wants the hash whose id is 2 within the members array.
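
The lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `resolve_path` is a hypothetical helper name, not something Instructions provides (Instructions does not interpret the path at all), and the real consumer in the follow-up PR may work differently.

```ruby
# Hypothetical helper, not part of scraper_test: walk `data` along `path`.
# A Symbol step looks up a hash key; a Hash step selects the first array
# element whose fields match every key/value pair in the step.
def resolve_path(data, path)
  path.reduce(data) do |current, step|
    case step
    when Hash
      current.find { |item| step.all? { |key, value| item[key] == value } }
    else
      current[step]
    end
  end
end

response = { members: [{ id: 1, name: 'Jim' }, { id: 2, name: 'Joe' }] }
path     = [:members, { id: 2 }]

resolve_path(response, path) # => { id: 2, name: 'Joe' }
```

In this sketch a Hash step that matches nothing yields nil, so any later step would raise; a real implementation would probably want a clearer error.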

The path can be used to drill down any number of levels. For example, given this response data:

:houses:
  :lower:
    :members:
      - :id: 1
        :name: David Bowman
      - :id: 2
        :name: Frank Poole
  :upper:
    :members:
      - :id: 1
        :name: Victor Kaminsky
        :notes:
          - Survey Team Leader
          - Deceased
      - :id: 2
        :name: Jack Kimball
        :notes:
          - Geophysicist
          - Deceased
      - :id: 3
        :name: Charles Hunter
        :notes:
          - Astrophysicist
          - Deceased

To get at Dr. Kaminsky's notes, the path field would be:

  :path:
    - :houses
    - :upper
    - :members
    - :id: 1
    - :notes
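
As a rough end-to-end illustration, the path above can be loaded from YAML and applied to a trimmed copy of the data. This is a sketch under assumptions: the variable names are made up for the example, the data is shortened to the upper house, and the walking logic is one possible interpretation rather than what Instructions itself does. `Hash#>=` is used here as a superset check (Ruby 2.3+), and the `permitted_classes:` keyword of `YAML.safe_load` assumes Ruby 2.6 or later.

```ruby
require 'yaml'

# Hypothetical instruction fragment containing only the :path field.
instructions_yaml = <<~YAML
  :path:
    - :houses
    - :upper
    - :members
    - :id: 1
    - :notes
YAML

# Symbols must be permitted explicitly; safe_load rejects them otherwise.
instructions = YAML.safe_load(instructions_yaml, permitted_classes: [Symbol])
instructions[:path] # => [:houses, :upper, :members, { id: 1 }, :notes]

# Trimmed copy of the response data shown earlier.
data = {
  houses: {
    upper: {
      members: [
        { id: 1, name: 'Victor Kaminsky', notes: ['Survey Team Leader', 'Deceased'] },
        { id: 2, name: 'Jack Kimball', notes: ['Geophysicist', 'Deceased'] }
      ]
    }
  }
}

# Hash steps pick the first element that contains all of the step's pairs;
# Symbol steps use fetch so that a missing key fails loudly with KeyError.
notes = instructions[:path].reduce(data) do |node, step|
  step.is_a?(Hash) ? node.find { |item| item >= step } : node.fetch(step)
end

notes # => ["Survey Team Leader", "Deceased"]
```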

tmtmtmtm commented 7 years ago

@ondenman do you have scrapers that need this level of drill-down, or are you just trying to find something that's extensible enough to cope with whatever comes up? I'm worried that this is quite complicated. Something much simpler might be enough to ensure that we can get tests for 90% of scrapers, and we could then revisit the other 10% once we've learned more about what's actually needed.

ondenman commented 7 years ago

@tmtmtmtm Good point. I see that I've overthought this.

ondenman commented 7 years ago

Closing in favour of https://github.com/everypolitician/scraper_test/pull/27