[🐛 Bug]: Retrieving the text content of an element with screenreader-only text returns inconsistent results

a-ctor commented 7 months ago

What happened?

Hi,

We use the .visually-hidden definition from https://a11y-guidelines.orange.com/en/articles/accessible-hiding/ to provide accessibility texts that are available only to screen reader users. We test our applications with Selenium and encountered an inconsistency that looks like a bug in Selenium: when retrieving the text content of an element, some of its visually hidden children are included while others are not.

Consider the provided HTML sample. It contains the previously mentioned .visually-hidden CSS class and four elements of note:

1. a span with the text content Normal Text,
2. a second span with the text content Normal Text 2 that is contained in a <p> element,
3. a visually hidden span with the text content Screen reader text,
4. and another visually hidden span with the text content Screen reader text 2 that is contained in a <p> element.

When retrieving the text content of the target element, the parent container of all four elements, I would expect the output to contain either the text of elements 1 and 2 only (the visible elements) or the text of elements 1, 2, 3, and 4 (all elements regardless of visibility).

Instead, the text of elements 1, 2, and 3 is returned while element 4 is excluded. For some reason, the <p> element prevents element 4 from being included in the text content. The effect can also be reproduced by replacing the <p> with other elements such as <pre> or <li>. My initial assumption was that any display: block element causes the exclusion, but other block-level elements such as <div> do not.

Here is a basic setup to reproduce this issue in C#. I don't know whether the problem can be reproduced with other language bindings as well. Adjust the URL to point to a local file containing the HTML.

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace SeleniumRepro;

public static class Program
{
  public static void Main(string[] args)
  {
    // Disposing the driver at the end of the scope ends the browser session.
    using var chromeDriver = new ChromeDriver();
    chromeDriver.Navigate().GoToUrl(@"<insert local file path here>");

    // Text returns the text of the element and its children as Selenium sees it.
    var text = chromeDriver.FindElement(By.Id("target")).Text;
    Console.WriteLine(text);
  }
}

Running this code, regardless of browser (Chrome/Edge/Firefox), will print:

Normal Text
Normal Text 2
Screen reader text

As mentioned before, I don't know whether this is considered a bug, but the seemingly arbitrary behavior looks like one to me. If you need anything more from me, I am happy to help.

Cheers, Patrick

How can we reproduce the issue?

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Selenium Repro</title>
</head>

<body>
    <style>
        /* Taken from https://a11y-guidelines.orange.com/en/articles/accessible-hiding/ */
        .visually-hidden {
            position: absolute !important;
            width: 1px !important;
            height: 1px !important;
            padding: 0 !important;
            margin: -1px !important;
            overflow: hidden !important;
            clip: rect(0, 0, 0, 0) !important;
            white-space: nowrap !important;
            border: 0 !important;
        }
    </style>
    <div id="target">
        <div><span>Normal Text</span></div>
        <div><p><span>Normal Text 2</span></p></div>
        <div class="visually-hidden"><span>Screen reader text</span></div>
        <div class="visually-hidden"><p><span>Screen reader text 2</span></p></div>
    </div>
</body>

</html>
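
To see exactly which texts Selenium drops, the value of Text can be compared with the DOM's raw textContent, which ignores CSS entirely. The following is only a sketch under assumptions: the class name TextComparison and the helper MissingLines are hypothetical names introduced for illustration, the comparison is line-based (it relies on each text sitting on its own line in the markup, as in the sample above), and the URL placeholder still has to be adjusted.

```csharp
using System;
using System.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public static class TextComparison
{
    // Hypothetical helper: returns the non-empty, trimmed lines of domText
    // that do not occur as a line in seleniumText.
    public static string[] MissingLines(string domText, string seleniumText)
    {
        var seen = seleniumText
            .Split('\n')
            .Select(line => line.Trim())
            .ToHashSet();

        return domText
            .Split('\n')
            .Select(line => line.Trim())
            .Where(line => line.Length > 0 && !seen.Contains(line))
            .ToArray();
    }

    public static void Main(string[] args)
    {
        using var driver = new ChromeDriver();
        driver.Navigate().GoToUrl(@"<insert local file path here>");

        var target = driver.FindElement(By.Id("target"));

        // Text as reported by Selenium.
        var seleniumText = target.Text;

        // Raw DOM text, read in the page via JavaScript; CSS has no influence on it.
        var domText = (string)((IJavaScriptExecutor)driver)
            .ExecuteScript("return arguments[0].textContent;", target);

        foreach (var line in MissingLines(domText, seleniumText))
        {
            Console.WriteLine($"Missing from Text: {line}");
        }
    }
}
```

If the behavior described above holds, the only line this should report is Screen reader text 2. An empty result would mean that Text and textContent agree for the sample page.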

Relevant log output

n/a

Operating System

Windows 10

Selenium version

C# 4.18.1

What are the browser(s) and version(s) where you see this issue?

Chrome 122, Firefox 123, Edge 122

What are the browser driver(s) and version(s) where you see this issue?

ChromeDriver 122.0.6261.94, Microsoft Edge WebDriver 122.0.2365.52

Are you using Selenium Grid?

No response

github-actions[bot] commented 7 months ago

@a-ctor, thank you for creating this issue. We will troubleshoot it as soon as we can.

Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then add the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

harsha509 commented 6 months ago

Hi @titusfortner,

I'm not sure whether this issue is specific to .NET. I tried the provided example with selenium-webdriver (versions 4.10.x to 4.19.x) and could not reproduce the problem described. getText retrieved all of the texts, as shown below: