salsadigitalauorg / merlin-framework

Merlin - migration framework
GNU General Public License v3.0

Error reporting when multiple selectors are used is wrong #110

Closed stooit closed 4 years ago

stooit commented 4 years ago

Describe the bug: When using multiple selectors (first match > single field), the page is incorrectly tracked in the error-not-found report even if one of the selectors matches.

Sample configuration

---
domain: https://www.example.com

urls:
  - /about
  - /sports/rugby

entity_type: standard_page
mappings:
  -
    field: title
    selector:
      - '//div[@id="content"]//h1[1]'
      - '//div[@id="content"]//h2[1]'
      - '//div[@id="content"]//h3[1]'

Expected behavior: If any of the selectors matches, the element should not appear in the 'not-found' report.
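The expected behavior can be sketched roughly as follows. This is a minimal Python illustration, not Merlin's actual (PHP) implementation: `first_match`, `process_field`, the `not_found` list, and the sample page are all hypothetical, and the selectors are written in relative form (`.//`) because Python's standard-library XPath subset does not accept absolute paths.

```python
import xml.etree.ElementTree as ET


def first_match(root, selectors):
    """Return the text of the first selector that matches, or None."""
    for selector in selectors:
        node = root.find(selector)
        if node is not None:
            return (node.text or "").strip()
    return None


def process_field(url, root, field, selectors, not_found):
    """Resolve one field; report it as not found only if every selector misses."""
    value = first_match(root, selectors)
    if value is None:
        not_found.append({"url": url, "field": field, "selectors": list(selectors)})
    return value


# Hypothetical page: only the <h2> selector matches.
PAGE = '<html><body><div id="content"><h2>Rugby</h2></div></body></html>'
SELECTORS = [
    './/div[@id="content"]//h1[1]',
    './/div[@id="content"]//h2[1]',
    './/div[@id="content"]//h3[1]',
]

not_found = []
title = process_field("/sports/rugby", ET.fromstring(PAGE), "title", SELECTORS, not_found)
print(title, not_found)
```

Here the first selector misses and the second matches, so nothing should be added to the not-found report for this page.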

Screenshots none

Additional context none

derklempner commented 4 years ago

To expand on this: it occurs when the not-found selector appears before the found ones, e.g.

selector:
      - '//div[@class="contentPage-content-area-that-doesnt-exist"]//h1[1]'
      - '//div[@class="contentPage-content-area"]//h1[1]'
      - '//div[@class="contentPage-content-area"]//h2[1]'
      - '//div[@class="contentPage-content-area"]//h3[1]'

derklempner commented 4 years ago

This should be addressed in feature/issue-110-multiple-selectors-error-reporting.

It needed a reasonably sized change to how TypeBase::process() is called, and multiple selectors are no longer inflated on load as separate (faked) field selectors, so it will need to be put through some more real-world testing.
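As a rough illustration of why inflating selectors could produce the report described in this issue, here is a Python sketch contrasting the two approaches. It is only one plausible reading of the comment above and does not reproduce Merlin's PHP code: `inflate`, `report_inflated`, `report_grouped`, and the `matches` callback are hypothetical names.

```python
def inflate(mapping):
    """Assumed old approach: one faked single-selector mapping per selector."""
    return [{"field": mapping["field"], "selector": s} for s in mapping["selector"]]


def report_inflated(mapping, matches):
    """Each faked mapping is processed alone, so a miss is reported immediately."""
    missed = []
    for fake in inflate(mapping):
        if matches(fake["selector"]):
            break  # first match wins; later selectors are never tried
        missed.append(fake["selector"])
    return missed


def report_grouped(mapping, matches):
    """Selectors stay together: report only when none of them match."""
    selectors = mapping["selector"]
    if any(matches(s) for s in selectors):
        return []
    return list(selectors)


mapping = {
    "field": "title",
    "selector": [
        '//div[@class="contentPage-content-area-that-doesnt-exist"]//h1[1]',
        '//div[@class="contentPage-content-area"]//h1[1]',
    ],
}
# Pretend only the second selector exists on the page.
present = {'//div[@class="contentPage-content-area"]//h1[1]'}

inflated_report = report_inflated(mapping, present.__contains__)
grouped_report = report_grouped(mapping, present.__contains__)
print(inflated_report)
print(grouped_report)
```

Under these assumptions, the inflated version reports the non-existent selector even though a later one matches, while the grouped version reports nothing.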