frescobaldi / qpageview

page-based viewer widget for Qt5/PyQt5
https://qpageview.org/
GNU General Public License v3.0

[PyQt6] Rubberband.selectedText() #28

Open igneus opened 2 months ago

igneus commented 2 months ago

In the current Qt6 build of Frescobaldi, I select a portion of the music view by dragging the mouse with the right button pressed. Then I right-click the selection to get the context menu. The "Copy Selected Text" menu item doesn't appear, although the selection contains text, i.e. Rubberband.selectedText() probably doesn't work.

bmjcode commented 2 months ago

I'll have to look, but I'm guessing this is a Poppler-to-QtPdf translation issue like #29.

bmjcode commented 2 months ago

I think the problem is in PdfPage.text() -- specifically, not knowing what units to use for QPdfDocument.getSelection(). ~Searching online just returns its unhelpful documentation and a forum post or two from others who don't know either.~

Edit: QPdfDocument measures page size in points, so I'm presuming those are the units for anything not otherwise specified. Now can someone clarify the units for our rect before and after self.mapFromPage().rect()?

This next part may or may not be helpful.

"Copy Selected Text" does kind of work if you change the first line of PdfPage.text() from

        rectf = rect.toRectF()

to

        rectf = self.mapFromPage(self.pageWidth, self.pageHeight).rect(rect)

like its counterpart in poppler.py. I say it only "kind of" works because the text it copies is rarely the text you've highlighted.
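On the units question, here is a minimal sketch of the kind of conversion that might be needed. It assumes the rubberband rect is in pixels of the rendered page, that QPdfDocument wants points with the origin at the top-left corner, and that the page is not rotated. None of that is confirmed in this thread. The function name and the plain (x, y, w, h) tuples are hypothetical stand-ins, not qpageview API.

```python
def pixels_to_points(rect, rendered_size, page_point_size):
    """Scale an (x, y, w, h) rect from rendered-page pixels to PDF points.

    Assumption: both coordinate systems share a top-left origin and the
    page is not rotated, so a per-axis scale factor is enough.
    """
    x, y, w, h = rect
    rendered_w, rendered_h = rendered_size
    points_w, points_h = page_point_size
    sx = points_w / rendered_w
    sy = points_h / rendered_h
    return (x * sx, y * sy, w * sx, h * sy)


# Example: a 595x842 pt page rendered at 1190x1684 px (scale factor 0.5)
print(pixels_to_points((100, 200, 300, 50), (1190, 1684), (595, 842)))
# -> (50.0, 100.0, 150.0, 25.0)
```

If the selected text still doesn't match the highlighted area after a conversion like this, the mismatch is probably somewhere other than simple scaling.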

bmjcode commented 2 months ago

This may be a bug in Qt. Here's a test program:

import sys

from PyQt6.QtCore import QCoreApplication, QRectF, QPointF
from PyQt6.QtPdf import QPdfDocument

a = QCoreApplication([])
doc = QPdfDocument(a)
doc.load(sys.argv[1])

# Get a QPdfSelection for all text on the first page
everything = doc.getAllText(0)

# Where on the page is the text found?
rect1 = everything.boundingRectangle()
print("rect1 =", rect1)

# Now attempt to select all text in that area manually
# (if this works, rect2 == rect1)
selection = doc.getSelection(0, rect1.topLeft(), rect1.bottomRight())
rect2 = selection.boundingRectangle()
print("rect2 =", rect2)

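# This has no practical value besides checking if QPdfDocument works
# since we can't convert indexes to page coordinates.
# Note: the third argument of getSelectionAtIndex() is a maximum length, not an end index
length = everything.endIndex() - everything.startIndex()
selection = doc.getSelectionAtIndex(0, everything.startIndex(), length)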
rect3 = selection.boundingRectangle()
print("rect3 =", rect3)

The expected result is that rect1 and rect2 are equal, and rect3 is at least reasonably close (it may vary slightly because it's looking up by index in QPdfSelection's internal list of strings rather than by page coordinates).

Instead, here's what I get testing it on one of my scores:

rect1 = PyQt6.QtCore.QRectF(17.0, 19.0, 567.0, 756.0)
rect2 = PyQt6.QtCore.QRectF()
rect3 = PyQt6.QtCore.QRectF(19.0, 19.0, 565.0, 732.0)

That's a pretty tiny area for rect2: it's an empty QRectF. No wonder we can't find any text in it.

Of course, this potentially being a Qt bug doesn't rule out separate bugs in my own code. :)
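For anyone repeating the test on other files, the "equal, or at least reasonably close" check can be done mechanically. This is only a sketch: it uses plain (x, y, w, h) tuples in place of QRectF, the helper name is hypothetical, and the tolerance values are arbitrary assumptions.

```python
def rects_close(a, b, tol=2.0):
    """True if two (x, y, w, h) rects differ by at most tol in every component."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


# Values copied from the output above; a null QRectF() is written as all zeros
rect1 = (17.0, 19.0, 567.0, 756.0)
rect2 = (0.0, 0.0, 0.0, 0.0)
rect3 = (19.0, 19.0, 565.0, 732.0)

print(rects_close(rect1, rect2))            # False
print(rects_close(rect1, rect3))            # False (heights differ by 24)
print(rects_close(rect1, rect3, tol=25.0))  # True
```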

igneus commented 1 month ago

I assume that QPdfDocument.getSelection() uses the same units as the only QPdfDocument method returning page dimensions - QPdfDocument.pagePointSize().

from PyQt6.QtCore import QPointF

# qdoc is a QPdfDocument with a file already loaded
topleft = QPointF(0, 0)
size = qdoc.pagePointSize(0)
bottomright = QPointF(size.width(), size.height())
print(qdoc.getSelection(0, topleft, bottomright).text())  # finds no text

I also tried it in C++ to rule out a shortcoming in the SIP wrapper. The result is the same as in Python: getAllText() finds text, but getSelection().text() doesn't.

#include <iostream>
#include <QPdfDocument>

int main()
{
  QPdfDocument qdoc;
  qdoc.load("test.pdf");

  std::cout << "Whole page:" << std::endl;
  std::cout << qdoc.getAllText(0).text().toStdString() << std::endl;

  std::cout << std::endl;

  QPointF topleft(0, 0);

  QSizeF size = qdoc.pagePointSize(0);
  QPointF bottomright(size.width(), size.height());

  std::cout << "Selection in size of the whole page:" << std::endl;
  std::cout << qdoc.getSelection(0, topleft, bottomright).text().toStdString() << std::endl;

  qdoc.close();

  return 0;
}

igneus commented 1 month ago

With bmjcode/qpageview#3 merged, text selection mostly works, but it has some shortcomings.