Closed genusistimelord closed 2 years ago
OK, I have figured it out.
The PDF is missing the xref table, which should be near the end of the file:
xref
0 1518
0000000000 65535 f
.........................more here truncating for readability
0028076214 00000 n
trailer
<</Root 8 0 R/Size 1518>>
Also, the trailer holding the Root is wrong in the new file.
In the new file it is:
1518 0 obj
<</Root 8 0 R/Size 1519/W[1 4 2]/Index[1 9 11 13 25 14 40 19 60 19 80 13 94 28 123 13 137 14 152 19 172 19 192 13 206 22 229 1 233 1 235 2 238 4 243 27 271 57 329 18 348 17 366 18 385 45 431 18 450 29 480 21 502 64 567 18 586 43 630 18 649 43 693 18 712 45 758 18 777 43 821 18 840 31 872 18 891 31 923 18 942 31 974 18 993 17 1011 18 1030 17 1048 18 1067 22 1090 18 1109 22 1132 18 1151 36 1188 18 1207 17 1225 18 1244 31 1276 18 1295 57 1353 18 1372 22 1395 124]/Length 10199>>stream
.........................binary stream data truncated for readability
endstream
endobj
The old file just has
<</Root 8 0 R/Size 1518>>
after the xref data.
So the break occurs in commit a3f531b, from the PR by @ralpha.
@J-F-Liu I would yank the 0.28.0 release from crates.io until this issue is fixed.
This causes Adobe PDF reader and Evince to refuse to load the PDF, displaying an error message that it is broken because needed data is missing. It works fine in PDF readers that have no features at all.
@genusistimelord could you provide me with some pdf file that I could use for testing?
Not all PDFs need a Cross Reference Table; they can also have a Cross Reference Stream, and it looks like that is what is being used here.
Before the commit, lopdf could only write a Cross Reference Table. In the PR I added the possibility of writing a Cross Reference Stream.
The PDF being loaded will keep the way it was encoded: a Table if it was a Table and a Stream if it was a Stream.
In the update I set the default (when using Document::new()) to the Stream. (Maybe we want to change the default.)
https://github.com/J-F-Liu/lopdf/blob/850b150461245cbf7c8dd780b31c76837769a0f5/src/document.rs#L57
(This is a newer (since PDF 1.5, in 2003) and more compact way of storing the info, and it supports some other features.)
You can set the type you want to use using:
document.reference_table.cross_reference_type = XrefType::CrossReferenceTable;
// or
document.reference_table.cross_reference_type = XrefType::CrossReferenceStream;
But I don't think this is the real issue here. I think something else went wrong.
Maybe the problem is that you are linearizing the PDF, yet the object ids are not sequential.
I don't know how many files you are merging together, but each gap in the object ids is exactly 1 (except for one, file number 14, which claims to have 3 objects but has 1 object in it).
But if you happen to be merging 60 files together using the merge.rs example code, maybe removing the + 1 on this line will partly solve the problem.
https://github.com/J-F-Liu/lopdf/blob/850b150461245cbf7c8dd780b31c76837769a0f5/examples/merge.rs#L75
Although I think that the file with 3 objects, but actually 1, will still give you the same error.
But I think we need to add a way for linearized files to not skip over missing object ids. (By the way, this is wanted behavior in incremental PDFs, hence the reason it does this right now.)
If you could provide me with some files, I could make sure this is the issue and fix it.
These are PDF files generated with SSRS. The plus one should not be an issue, since he removed one from the count https://github.com/J-F-Liu/lopdf/blob/master/src/reader.rs#L170 which means max_id == last_id, so you need to add 1 in order to reorder the ids for proper merging.
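The reasoning about the + 1 can be sketched in isolation. This is only an illustration, not lopdf code: IdRange and assign_ranges are hypothetical names, and it assumes each input document is renumbered densely so that its max_id is the last id it actually uses.

```rust
/// Id range given to one input document after renumbering (hypothetical type).
#[derive(Debug, PartialEq)]
struct IdRange {
    start: u32,
    max_id: u32, // last id actually used, i.e. max_id == last_id
}

/// Assign consecutive, non-overlapping id ranges to documents that hold
/// `object_counts[i]` objects each, beginning at `first_id`.
/// Documents with zero objects are skipped.
fn assign_ranges(object_counts: &[u32], first_id: u32) -> Vec<IdRange> {
    let mut next_start = first_id;
    let mut ranges = Vec::new();
    for &count in object_counts {
        if count == 0 {
            continue;
        }
        let max_id = next_start + count - 1;
        ranges.push(IdRange { start: next_start, max_id });
        // max_id is already in use, so the next document starts one past it.
        // Dropping this + 1 would make two documents share an id.
        next_start = max_id + 1;
    }
    ranges
}

fn main() {
    for range in assign_ranges(&[9, 13, 14], 1) {
        println!("{:?}", range);
    }
}
```

With the + 1 in place, the ranges are contiguous and never overlap, so it should not introduce gaps on its own.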
Here are the files I tested to ensure they do break when merged with the current setup: PDF.zip
Also, the way I currently merge the PDFs is in this PR, which I canceled because I thought it was an issue I caused and did not know it was an issue with a previous PR.
@ralpha I have tested document.reference_table.cross_reference_type = XrefType::CrossReferenceTable; though I had to make a change to allow xref to be public so I could set the option. Also, we should add a document version check for when it's set to XrefType::CrossReferenceStream, making sure people can only use it with PDF version 1.5 and greater.
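A rough sketch of what such a version check could look like. The helper names parse_version and supports_xref_stream are hypothetical and not part of lopdf; the sketch assumes the version is available as a string like "1.5".

```rust
/// Parse a PDF version string such as "1.5" into (major, minor).
/// Returns None if the string is not of the form "<int>.<int>".
fn parse_version(version: &str) -> Option<(u32, u32)> {
    let (major, minor) = version.trim().split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Cross-reference streams were introduced in PDF 1.5, so anything older
/// (or unparseable) is treated as not supporting them.
fn supports_xref_stream(version: &str) -> bool {
    match parse_version(version) {
        // Tuples compare lexicographically: major first, then minor.
        Some(v) => v >= (1, 5),
        None => false,
    }
}

fn main() {
    for v in ["1.4", "1.5", "1.7", "2.0"].iter() {
        println!("{}: {}", v, supports_xref_stream(v));
    }
}
```

Whether a failed check should return an error or silently fall back to a Cross Reference Table is a design choice left open here.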
@J-F-Liu Thank you for yanking the current crate for now. That will prevent further issues until @ralpha can get the stream side fixed. Also, please take a look at my PR changes when you get a chance, as I made xref public and have updated the match statement in expand and the merge.rs example.
Currently don't have time to fix it this week, but will take a look soon.
@genusistimelord when you changed the Xref table, make sure that in:
Index[1 9 11 13 25 14 40 19 ...]
the list is shorter (so ObjId is sequential).
Index is a list of pairs: 1 9 means "starting at id 1 there are 9 objects".
In the table, the equivalent is the 0 1518 lines (just 2 int values).
They should be similar in both cases and have similar numbers.
The same numbering is used there: 0 1518 means "starting at id 0 there are 1518 objects".
The objects are then listed below it in both cases, either in the stream or in the table.
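To make the pair reading concrete, here is a small standalone sketch (not lopdf code; Subsection, parse_index and gaps are hypothetical names) that splits an Index array into (first id, count) pairs and lists the object ids skipped between consecutive pairs.

```rust
/// One run of consecutive object ids: `count` ids starting at `first_id`.
#[derive(Debug, PartialEq)]
struct Subsection {
    first_id: u32,
    count: u32,
}

/// Split a flat Index array into (first_id, count) pairs.
/// Returns None if the array has an odd number of entries.
fn parse_index(index: &[u32]) -> Option<Vec<Subsection>> {
    if index.len() % 2 != 0 {
        return None;
    }
    Some(
        index
            .chunks_exact(2)
            .map(|pair| Subsection { first_id: pair[0], count: pair[1] })
            .collect(),
    )
}

/// List the object ids that are skipped between consecutive subsections.
fn gaps(subsections: &[Subsection]) -> Vec<u32> {
    let mut missing = Vec::new();
    for window in subsections.windows(2) {
        // First id after the end of the current run.
        let end = window[0].first_id + window[0].count;
        for id in end..window[1].first_id {
            missing.push(id);
        }
    }
    missing
}

fn main() {
    // First four pairs of the Index array quoted in this thread.
    let index = [1, 9, 11, 13, 25, 14, 40, 19];
    let subsections = parse_index(&index).expect("Index must hold pairs");
    println!("skipped ids: {:?}", gaps(&subsections));
}
```

For those first four pairs the skipped ids are 10, 24 and 39, one id per gap. A single pair such as 0 1518 has no gaps at all.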
So I was updating and trying out 0.28.0 of lopdf, the newest update, and noticed that when I merge a ton of PDFs together that I used to merge with 0.27, the result is no longer a workable PDF. I don't know exactly what caused the issue, but I can try to look at the changes to see what it could be. This is a major problem if it can no longer merge Adobe PDF files together.
I also saw a noticeable size difference between the PDFs, so I will diff them to see what might have changed. Old: 27,448 KB; new: 27,433 KB.