mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Texts show up in the wrong place and overlapped over other texts. #4841

takahiroyoshi closed this issue 4 years ago

takahiroyoshi commented 10 years ago

Open the second page of http://www.kccs.co.jp/ict/mobile-pilina/leaflet_pilina_wimax_201312.pdf

[Screenshot: wrong_place]

Firefox 29.0 on Windows 8.1. I tested it at: http://mozilla.github.io/pdf.js/web/viewer.html

Snuffleupagus commented 10 years ago

"PDF 9a3c62e0cf369541a2bf865b73edb4b [1.5 Adobe PDF library 9.90 / Adobe Illustrator CS5.1] (PDF.js: 1.0.248 [WebGL])"

The issue here is that the file contains layers, and that is what it looks like when all of them are visible simultaneously. Duplicate of #3281.

yurydelendik commented 10 years ago

Mac OSX Preview produces the same result:

[Screenshot: screen shot 2014-05-26 at 10 44 38 am]

richerm commented 8 years ago

We ran into a similar issue with hidden layers in a PDF: they are correctly hidden in Adobe Acrobat but visible in PDF.js.

I'm not familiar enough with the code base to attempt a fix, but I did take some time to look at PDFs with the PDF Object Browser (http://brendandahl.github.io/pdf.js.utils/browser/) to determine where that information is stored.

1) Root has OCProperties/D dictionaries for ON/OFF layers. Visible layers are listed under "ON"; hidden layers are listed under "OFF".

2) The layers from step 1 are in the OCGs array under OCProperties, which includes the layer name and the 'intent array'. Note this intent array ID.

3) For a given page, under Resources/Properties there are references to MC0-MCX, each of which has the layer name and the intent array, which refers to the same ID found in step 2.

4) Under "Contents (stream)" each resource (in the content itself) refers to the layer it corresponds to: "/OC /MC0 BDC" (i.e. MC0).
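Steps 1 to 3 above can be sketched with plain JavaScript objects. This is only a simplified model under stated assumptions, not PDF.js internals: `catalog`, `pageProperties`, and `hiddenPropertyTags` are hypothetical names, and indirect references are modeled as strings such as "13 0 R".

```javascript
// Simplified, hypothetical model of the structures described above.
// Indirect object references are modeled as plain strings such as "12 0 R".
const catalog = {
  OCProperties: {
    // Step 2: every layer in the file.
    OCGs: ["12 0 R", "13 0 R"],
    // Step 1: default visibility of those layers.
    D: { ON: ["12 0 R"], OFF: ["13 0 R"] },
  },
};

// Step 3: the page's Resources/Properties maps tags (MC0, MC1, ...) to layers.
const pageProperties = { MC0: "12 0 R", MC1: "13 0 R" };

// Returns the set of property tags whose layer is listed under OFF.
function hiddenPropertyTags(catalog, pageProperties) {
  const off = new Set(catalog.OCProperties?.D?.OFF ?? []);
  const hidden = new Set();
  for (const [tag, ref] of Object.entries(pageProperties)) {
    if (off.has(ref)) {
      hidden.add(tag);
    }
  }
  return hidden;
}

// The tags that the content-stream check in step 4 would need to skip.
const hiddenTags = hiddenPropertyTags(catalog, pageProperties);
console.log([...hiddenTags]);
```

A set is used so that the lookup in step 4 ("is this /OC tag hidden?") is a single `has` call per marked-content section.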

Granted, I don't have the Adobe spec to confirm that this is sufficient, but it should point in the right direction. With the above information, you can determine which resources should (and should not) be displayed.
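As a rough illustration of that last point, the sketch below drops everything between a hidden "/OC /MCx BDC" and its matching EMC. It assumes the content stream has already been tokenized into operator entries and that the set of hidden tags is known (hard-coded here as MC1); `ops`, `hiddenTags`, and `filterHidden` are hypothetical names, and cases such as optional content attached to XObjects or annotations are not handled.

```javascript
// Hypothetical, pre-tokenized content stream: each entry is one operator
// with its operands. A real stream would need a proper PDF parser.
const ops = [
  { op: "BDC", args: ["/OC", "/MC0"] },
  { op: "Tj", args: ["(visible text)"] },
  { op: "EMC", args: [] },
  { op: "BDC", args: ["/OC", "/MC1"] },
  { op: "Tj", args: ["(hidden text)"] },
  { op: "EMC", args: [] },
];

// Assumed to be known already: property tags whose layer is hidden.
const hiddenTags = new Set(["MC1"]);

// Keeps only the operators that are not inside a hidden /OC section.
function filterHidden(ops, hiddenTags) {
  const stack = []; // one boolean per open marked-content section
  const kept = [];
  for (const entry of ops) {
    if (entry.op === "BDC" || entry.op === "BMC") {
      const [tag, name] = entry.args;
      const isHidden =
        tag === "/OC" &&
        typeof name === "string" &&
        hiddenTags.has(name.slice(1)); // drop the leading "/"
      stack.push(isHidden);
    }
    // An operator is skipped if any enclosing section is hidden.
    const insideHidden = stack.includes(true);
    if (entry.op === "EMC") {
      stack.pop();
    }
    if (!insideHidden) {
      kept.push(entry);
    }
  }
  return kept;
}

const visibleOps = filterHidden(ops, hiddenTags);
console.log(visibleOps.map((entry) => entry.op).join(" "));
```

The stack is there because marked-content sections can nest, so a visible section inside a hidden one still has to be skipped.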

Hope that helps point someone in the right direction for a fix!

dionatanaraujo commented 7 years ago

@richerm thanks for sharing your experience!

It helped a lot.

Below is the code I wrote to solve a similar problem with "hidden layers".

// Fragment meant to run inside a function; `resources` and `dict` come from the surrounding PDF.js code.
var _hiddenLayers = resources.xref.root.map.OCProperties.map.D.get('OFF');
for (var _xx = 0; _xx < _hiddenLayers.length; _xx++) {
  // _propName is the key of the layer to remove; a different key could be used.
  var _propName = resources.xref.fetchIfRef(_hiddenLayers[_xx]).get('Name');
  if (dict.get(_propName)) {
    return true;
  }
}

I hope it helps other people.

Snuffleupagus commented 4 years ago

Duplicate of #269