christian-vigh-phpclasses / PdfToText

Extracts text from PDF files

Question Text coordinates #3

Closed Johann-S closed 8 years ago

Johann-S commented 8 years ago

Hi, first of all, thank you for your awesome library.

My question is: how can I retrieve the coordinates of a searched text in a PDF with PdfToText?

christian-vigh-phpclasses commented 8 years ago

Hello Johann,

Thanks for your greetings!

What kind of coordinates do you want to retrieve? The page number? The line and character position? The graphical position on a page?


From: Johann [mailto:notifications@github.com] Sent: Tuesday, 5 July 2016 11:03 To: christian-vigh-phpclasses/PdfToText Subject: [christian-vigh-phpclasses/PdfToText] Question Text coordinates (#3)


Johann-S commented 8 years ago

I want to retrieve the graphical position on a page of a searched text.

christian-vigh-phpclasses commented 8 years ago

It is not implemented yet (the original goal of this class was simply to extract text from PDF files).

Currently I’m not tracking graphical coordinates; I’m just processing text-drawing instructions in the order they arrive, hoping that they arrive in the correct order (which is sometimes not the case).
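For readers wondering what tracking coordinates would involve, here is a minimal sketch in Python. It is not PdfToText code and not how the class works. It assumes an already-decompressed content stream and handles only the standard PDF operators `BT`, `Tm`, `Td` and `Tj`. The names `text_positions`, `find_text` and `SAMPLE` are made up for illustration. It ignores the page transformation matrix, font metrics, `TJ` arrays and string escapes, and it reports the origin of the current text line rather than the exact start of each string.

```python
import re

# A token is either a literal string "( ... )" or a run of other characters.
TOKEN = re.compile(r"\((?:[^()\\]|\\.)*\)|[^\s()]+")

IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def text_positions(content):
    """Return [(x, y, text)] for every Tj, in the order it is drawn.

    x and y are the origin of the current text line (text space),
    not the exact pen position after earlier strings on that line.
    """
    operands = []        # operands seen since the last handled operator
    line = IDENTITY      # text line matrix (a, b, c, d, e, f)
    found = []
    for tok in TOKEN.findall(content):
        if tok == "BT":                       # begin text object
            line = IDENTITY
        elif tok == "Tm":                     # set the matrix: a b c d e f Tm
            line = tuple(float(v) for v in operands[-6:])
        elif tok == "Td":                     # move to next line: tx ty Td
            tx, ty = (float(v) for v in operands[-2:])
            a, b, c, d, e, f = line
            line = (a, b, c, d, tx * a + ty * c + e, tx * b + ty * d + f)
        elif tok == "Tj":                     # show string: (text) Tj
            found.append((line[4], line[5], operands[-1][1:-1]))
        else:
            operands.append(tok)              # operand or unhandled operator
            continue
        operands.clear()
    return found


def find_text(content, needle):
    """Return the (x, y) line origins of strings containing needle."""
    return [(x, y) for x, y, text in text_positions(content) if needle in text]


# Hypothetical content stream used only for this example.
SAMPLE = "BT /F1 12 Tf 1 0 0 1 72 720 Tm (Hello) Tj 0 -14 Td (World) Tj ET"
positions = find_text(SAMPLE, "World")
```

A real implementation would also need glyph widths to locate a match inside a string, and would have to join strings that the PDF splits across several drawing instructions.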

In the future I plan to write a version 2 of my class that will try to approximate the graphical page layout as an ASCII page layout, which will be a tough task, so you’ll have to be patient. In that context, however, it makes sense for version 2 to keep track of graphical text coordinates, so I am adding this to my to-do list.
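To illustrate the idea of approximating a graphical layout with an ASCII one, here is a rough sketch in Python. It is not the planned version 2. It assumes text fragments are already available as `(x, y, text)` tuples in PDF points with the origin at the bottom left of the page. The function name `ascii_layout`, the 792-point page height and the fixed 6 by 12 point character cell are assumptions chosen for the example.

```python
def ascii_layout(fragments, page_height=792.0, cell_w=6.0, cell_h=12.0):
    """Place (x, y, text) fragments on a character grid and return it as text.

    Each fragment lands in the cell containing its origin, so the result
    does not depend on the order in which the fragments were drawn.
    """
    rows = {}  # row index -> list of characters
    for x, y, text in fragments:
        row = int((page_height - y) // cell_h)   # PDF y grows upwards
        col = int(x // cell_w)
        line = rows.setdefault(row, [])
        if len(line) < col:                      # pad with spaces up to col
            line.extend(" " * (col - len(line)))
        line[col:col + len(text)] = text         # write, overwriting overlaps
    if not rows:
        return ""
    first, last = min(rows), max(rows)
    return "\n".join(
        "".join(rows.get(r, [])).rstrip() for r in range(first, last + 1)
    )


# Hypothetical fragments, deliberately listed out of reading order.
fragments = [
    (72.0, 706.0, "Again"),
    (144.0, 720.0, "World"),
    (72.0, 720.0, "Hello"),
]
page = ascii_layout(fragments)
```

A fixed cell size is the crudest possible choice. Proportional fonts, varying font sizes and rotated text would all need extra handling, which is part of why this is a tough task.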



Johann-S commented 8 years ago

OK, thank you very much for your quick answer :+1: