Closed reedrolemodel closed 2 months ago
I came here to report the issue! I'm glad to discover it already has a PR.
1) Static pages GET /de/cookies renders a cookie policy
   Failure/Error:
     text.to_s.gsub(/\A[[:space:]&&[^\u00a0]]+/, '')
       .gsub(/[[:space:]&&[^\u00a0]]+\z/, '')
       .gsub(/\n+/, "\n")
       .tr("\u00a0", ' ')

   ArgumentError:
     invalid byte sequence in UTF-8
[Screenshot Image]: tmp/capybara/screenshots/failures_r_spec_example_groups_static_pages_get_de_cookies_renders_a_cookie_policy_91.png
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-playwright-driver-0.5.2/lib/capybara/playwright/node.rb:134:in `gsub'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-playwright-driver-0.5.2/lib/capybara/playwright/node.rb:134:in `block in visible_text'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-playwright-driver-0.5.2/lib/capybara/playwright/node.rb:83:in `assert_element_not_stale'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-playwright-driver-0.5.2/lib/capybara/playwright/node.rb:120:in `visible_text'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/node/element.rb:60:in `block in text'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/node/base.rb:77:in `synchronize'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/node/element.rb:60:in `text'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/queries/selector_query.rb:603:in `matches_text_regexp'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/queries/selector_query.rb:607:in `matches_text_regexp?'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/queries/selector_query.rb:554:in `matches_text_filter?'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/queries/selector_query.rb:452:in `matches_system_filters?'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/queries/selector_query.rb:122:in `matches_filters?'
# /Users/mhenrixon/.gem/ruby/3.3.5/gems/capybara-3.40.0/lib/capybara/result.rb:32:in `block in initialize'
Some pages cause an `invalid byte sequence in UTF-8` exception to be raised when calling `text.to_s.gsub(/\A[[:space:]&&[^\u00a0]]+/, '')`. Adding `scrub` prevents this.

Specific context: It seems an HTML entity (presumably `&nbsp;`) gets interpreted as "\xA0", or byte 160, which on its own is not a valid UTF-8 byte sequence. Using `charlock_holmes`, the encoding of the entire page is reported as `ISO-8859-1` with 54% confidence.
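
For anyone who wants to reproduce this outside the test suite, here is a minimal sketch of the failure mode and of what `scrub` changes. It is not the driver's actual code: `normalize_visible_text` is a hypothetical helper that copies the chain from the failure output and puts `scrub` first, and the sample string is invented. The last part assumes the stray byte really is ISO-8859-1, which the 54% detection result only hints at.

```ruby
# Hypothetical helper: the gsub/tr chain from the failure output,
# with String#scrub applied first so invalid bytes cannot reach the regexps.
def normalize_visible_text(text)
  text.to_s
      .scrub                                   # invalid bytes become U+FFFD
      .gsub(/\A[[:space:]&&[^\u00a0]]+/, '')   # strip leading whitespace except NBSP
      .gsub(/[[:space:]&&[^\u00a0]]+\z/, '')   # strip trailing whitespace except NBSP
      .gsub(/\n+/, "\n")                       # collapse runs of newlines
      .tr("\u00a0", ' ')                       # turn real NBSPs into plain spaces
end

# Invented sample: a string tagged as UTF-8 that contains a lone 0xA0 byte.
broken = "  Cookie\xA0Policy\n\n\nmore \n".dup.force_encoding(Encoding::UTF_8)
puts broken.valid_encoding?

# Without scrub, regexp matching on the broken string is expected to raise,
# as in the trace above. The message is captured instead of letting it propagate.
error_message =
  begin
    broken.gsub(/\n+/, "\n")
    nil
  rescue ArgumentError => e
    e.message
  end
puts error_message.inspect

# With scrub, the chain completes; the stray byte shows up as U+FFFD.
cleaned = normalize_visible_text(broken)
puts cleaned.inspect

# Assumption: if the byte really is ISO-8859-1, reinterpreting it yields a
# genuine NBSP (U+00A0), which the final tr then converts to a space.
latin1 = broken.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
reinterpreted = normalize_visible_text(latin1)
puts reinterpreted.inspect
```

Note the trade-off: `scrub` avoids the exception but leaves a replacement character where the non-breaking space was, so a text matcher expecting a plain space there would still not match. Reinterpreting the bytes keeps the space, but only if the encoding guess is right.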