Closed. IllustrisJack closed this 2 weeks ago
Hi, sorry for taking so long to reply.
As the title says, the text layer selection could be improved; it currently works better or worse depending on the PDF. This can also be seen in the example uploaded on your docs. I'd assume this comes from certain blank spaces being ignored or hard cut, e.g. also for newlines.
Are you using a Chromium-based browser? This issue seems to be related to how the browser paints the selection layer; in Firefox the selection looks better:
Also, this library does not apply any processing to the text layer. It just uses the pdf.js API without changing the text items. For Chrome I could try to add some cleanup step, but I am aware that it could break other things just for the sake of something aesthetic.
Thanks for the reply, and no worries about the delay; I am sure all of this is done in your free time! Yes, I was using multiple Chromium-based browsers when testing this. I guess it is mostly for aesthetic reasons, so if it is too much work we can close this issue.
I guess it is mostly for aesthetic reasons, so if it is too much work we can close this issue.
Maybe I am misunderstanding you: you said some blank spaces are being ignored on selection, but I do not see exactly where. Using the same example, this is the copy result for this text:
1. Introduction
Dynamic languages such as JavaScript, Python, and Ruby, are pop-
ular since they are expressive, accessible to non-experts, and make
deployment as easy as distributing a source file. They are used for
small scripts as well as for complex applications. JavaScript, for
example, is the de facto standard for client-side web programming
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
PLDI’09, June 15–20, 2009, Dublin, Ireland.
Copyright c© 2009 ACM 978-1-60558-392-1/09/06. . . $5.00
But this result is the same in all cases, whether testing on the library docs or the pdf.js demo page (even between Chromium and Firefox). Could I be missing something?
I think you can safely ignore my statement. Some PDF viewers also add newlines as copyable spaces to retain the format. This is not the case with pdf.js, so I probably got things confused!
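For anyone who does want line breaks preserved when extracting text, here is a minimal sketch of the idea. It assumes text items that carry `str` and `hasEOL` fields, modeled on the items pdf.js returns from `getTextContent()`. The `TextItemLike` interface and the `joinTextItems` helper are hypothetical names for illustration; they are not part of pdf.js or of this library.

```typescript
// Hypothetical minimal shape, modeled on the `str` and `hasEOL` fields
// of the text items returned by pdf.js `getTextContent()`.
interface TextItemLike {
  str: string;
  hasEOL: boolean;
}

// Join items into one string, emitting "\n" wherever an item ends a line.
// Illustrative helper only; it is not part of pdf.js or of this library.
function joinTextItems(items: TextItemLike[]): string {
  let out = "";
  for (const item of items) {
    out += item.str;
    if (item.hasEOL) {
      out += "\n";
    }
  }
  return out;
}

// Sample data loosely based on the example text quoted in this thread.
const sample: TextItemLike[] = [
  { str: "1. Introduction", hasEOL: true },
  { str: "Dynamic languages such as JavaScript, Python, and Ruby, are pop-", hasEOL: true },
  { str: "ular since they are expressive,", hasEOL: false },
];

console.log(joinTextItems(sample));
```

This only affects text extracted programmatically; it does not change what the browser copies from the rendered text layer.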
First of all, thank you for this project! Great stuff.
As the title says, the text layer selection could be improved; it currently works better or worse depending on the PDF. This can also be seen in the example uploaded on your docs. I'd assume this comes from certain blank spaces being ignored or hard cut, e.g. also for newlines.
Additional context