PhilGale92 / docx

PHP Based Docx Parser
MIT License

Link handling issues #48

Closed nosun closed 7 years ago

nosun commented 7 years ago

Hi Phil Gale, thanks for your work. I really like it, and I use your docx package in one of my projects, but I have now found an issue.

If a paragraph contains a link in its text, then after processing the link text is sometimes moved to the front of the paragraph, and sometimes the string is truncated.

I tried to solve this problem myself, but it is a bit difficult for me. test.docx

Can you help me look into this?

Thanks again.

nosun commented 7 years ago

Another problem related to this is that the “href” attribute eventually gets replaced with the “inner html content”.

PhilGale92 commented 7 years ago

Hmm ok I see the problems...

Part 1) The hyperlink was handled first, then the standard text runs. Apparently, in all of the Word files I have parsed, hyperlinks always came first, so I never came across that one before! I have a fix prepared where it just deals with each DomElement in order... Will be sorted soon.
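For illustration, the "deal with each DomElement in order" idea could look roughly like the sketch below. This is not the fix that went into the repo; `collectText` and `renderParagraph` are hypothetical helper names, and the sketch assumes a paragraph whose direct children are `w:r` runs and `w:hyperlink` elements in the WordprocessingML namespace. The link is emitted without an `href` here, since resolving the target is a separate problem.

```php
<?php
// WordprocessingML main namespace (the "w:" prefix in document.xml).
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical helper: concatenate the text of every w:t below $node.
function collectText(DOMElement $node): string
{
    $text = '';
    foreach ($node->getElementsByTagNameNS(W_NS, 't') as $t) {
        $text .= $t->textContent;
    }
    return $text;
}

// Hypothetical helper: walk the paragraph's children in document order,
// so a hyperlink in the middle of a sentence stays in the middle.
function renderParagraph(DOMElement $paragraph): string
{
    $html = '';
    foreach ($paragraph->childNodes as $child) {
        if (!$child instanceof DOMElement || $child->namespaceURI !== W_NS) {
            continue; // skip whitespace nodes and foreign elements
        }
        if ($child->localName === 'hyperlink') {
            $html .= '<a>' . htmlspecialchars(collectText($child)) . '</a>';
        } elseif ($child->localName === 'r') {
            $html .= htmlspecialchars(collectText($child));
        }
    }
    return $html;
}

// Made-up paragraph with a link between two plain text runs.
$xml = '<w:p xmlns:w="' . W_NS . '">'
     . '<w:r><w:t xml:space="preserve">See </w:t></w:r>'
     . '<w:hyperlink><w:r><w:t>the docs</w:t></w:r></w:hyperlink>'
     . '<w:r><w:t xml:space="preserve"> for details.</w:t></w:r>'
     . '</w:p>';

$doc = new DOMDocument();
$doc->loadXML($xml);
echo renderParagraph($doc->documentElement), "\n";
// prints: See <a>the docs</a> for details.
```

Handling all hyperlinks first and the runs afterwards would instead put "the docs" at the front of the output, which matches the reordering described in the report.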

Part 2) The Word parser only handles links when the text is the same as the link... I've had a look and found how external links are handled, though: Word apparently makes a new entry in document.xml.rels for each hyperlink, which I will need to pull data from.
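As a rough sketch of pulling that data out (again, not the code that ended up in the repo): each external `w:hyperlink` carries an `r:id` attribute, and the matching `Relationship` entry in document.xml.rels holds the URL in its `Target` attribute. `loadHyperlinkTargets` and `hyperlinkHref` are hypothetical names, and the XML strings are made-up samples.

```php
<?php
// Namespace of the .rels package part, and of the "r:" attributes in document.xml.
const REL_PKG_NS = 'http://schemas.openxmlformats.org/package/2006/relationships';
const REL_DOC_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships';

// Hypothetical helper: build a map of relationship Id => Target for hyperlinks.
function loadHyperlinkTargets(string $relsXml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($relsXml);
    $targets = [];
    foreach ($doc->getElementsByTagNameNS(REL_PKG_NS, 'Relationship') as $rel) {
        if ($rel->getAttribute('Type') === REL_DOC_NS . '/hyperlink') {
            $targets[$rel->getAttribute('Id')] = $rel->getAttribute('Target');
        }
    }
    return $targets;
}

// Hypothetical helper: resolve a w:hyperlink element to its URL, or null
// when it has no r:id (for example an internal bookmark link).
function hyperlinkHref(DOMElement $hyperlink, array $targets): ?string
{
    $id = $hyperlink->getAttributeNS(REL_DOC_NS, 'id');
    return $targets[$id] ?? null;
}

// Made-up sample data standing in for the two files inside the .docx.
$relsXml = '<Relationships xmlns="' . REL_PKG_NS . '">'
         . '<Relationship Id="rId5" Type="' . REL_DOC_NS . '/hyperlink"'
         . ' Target="https://example.com/" TargetMode="External"/>'
         . '</Relationships>';
$linkXml = '<w:hyperlink'
         . ' xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'
         . ' xmlns:r="' . REL_DOC_NS . '" r:id="rId5">'
         . '<w:r><w:t>Example site</w:t></w:r></w:hyperlink>';

$targets = loadHyperlinkTargets($relsXml);
$link = new DOMDocument();
$link->loadXML($linkXml);
echo hyperlinkHref($link->documentElement, $targets), "\n";
// prints: https://example.com/
```

In a real .docx the relationships part normally lives at `word/_rels/document.xml.rels` inside the zip, so its contents could be read with `ZipArchive::getFromName()` before being passed to a function like this.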

Thanks for the report, I will have fixes in the repo in the next day or so.

nosun commented 7 years ago

Thanks very much. I am not in a hurry, so I will wait for your fix!