nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

NVDA ignores line breaks in PDFs, making some types of text like source code unreadable with Adobe Reader #17313

Open Neurrone opened 1 week ago

Neurrone commented 1 week ago

Continuation of #7275

Steps to reproduce:

Download this file, open it in Adobe Reader, and try to read it with NVDA.

Actual behavior:

This is NVDA's speech output for the third line:

class Node: def __init__(self, value): 

Expected behavior:

Visually, and as is obvious from context, this should be two separate lines, like so:

class Node:
  def __init__(self, value): 

These are two separate lines visually in the file.

From https://github.com/nvaccess/nvda/issues/7275#issuecomment-308015865

PDF has semantic tags for paragraphs, lists, tables and the like. However, it does not differentiate author inserted line breaks (as in source code or poetry, sometimes known as hard line breaks) from line breaks used to wrap text which cannot fit on a single line (sometimes known as soft line breaks). Because NVDA splits text into lines itself (according to the "Maximum number of characters on one line" Browse Mode setting), we strip line break characters, as otherwise, you end up with a lot of long lines followed by short lines (as I recall happened in JAWS when I used it years ago). Having spoken to someone involved in PDF accessibility specification writing, my understanding is that the correct way to author such content is to tag each line as a separate list item or paragraph. Unfortunately, it seems no one actually does this in the wild. I think the only way we could reasonably solve this is to ignore NVDA's own settings for splitting lines and instead use only the line breaks in the PDF. That would also require us to not treat line breaks as paragraphs for PDF. This would be somewhat inconsistent with browse mode everywhere else, but I think consistency is probably outweighed by usability here.
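To make the two behaviors in the quoted comment concrete, here is a minimal Python sketch. It is not NVDA's actual code: the function names and the `MAX_LINE_LENGTH` value are hypothetical, and `textwrap` only stands in for whatever line splitting browse mode really does. The first function strips line breaks and re-wraps at a fixed width, as the comment describes for the current behavior. The second keeps the line breaks found in the document, as the comment proposes.

```python
import textwrap

# Hypothetical stand-in for the "Maximum number of characters on one line"
# Browse Mode setting; the value here is only for illustration.
MAX_LINE_LENGTH = 100


def lines_with_breaks_stripped(text: str, max_len: int = MAX_LINE_LENGTH) -> list[str]:
    """Collapse all whitespace, including line breaks, then re-wrap at max_len."""
    # Note: this also discards indentation, because runs of whitespace collapse.
    flattened = " ".join(text.split())
    return textwrap.wrap(flattened, width=max_len)


def lines_with_breaks_preserved(text: str) -> list[str]:
    """Use only the line breaks already present in the text; do not re-wrap."""
    return text.splitlines()


sample = "class Node:\n  def __init__(self, value):"

print(lines_with_breaks_stripped(sample))
print(lines_with_breaks_preserved(sample))
```

With the sample from this issue, the first function yields a single line (what NVDA currently reads), while the second yields the two lines that appear visually in the PDF.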

NVDA logs, crash dumps and other attachments:

System configuration

NVDA installed/portable/running from source:

Installed

NVDA version:

alpha-34198,67f6cb99 (2025.1.0.34198)

Windows version:

Windows 11 23H2 (OS Build 22631.4317)

Name and version of other software in use when reproducing the issue:

Adobe Reader 2024.003.20180

Other information about your system:

Other questions

Does the issue still occur after restarting your computer?

Yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

Yes, this has been an issue since 2017.

If NVDA add-ons are disabled, is your problem still occurring?

Yes

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

Yes

Neurrone commented 2 days ago

Should this be P2 instead? I would imagine that reading PDFs with Adobe Reader is somewhat common.