Closed AayushSameerShah closed 11 months ago
Thank you for using docx2python.
There is currently no way to suppress hyperlink extraction. I will think about how this might be accomplished simply. For now, I would recommend a regex to strip away the HTML tags.
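As a rough illustration of the regex workaround, here is a minimal sketch. It assumes the extracted text contains links in the form `<a href="...">link text</a>`, as in the question; the names `ANCHOR_TAG` and `strip_hyperlinks` are made up for this example and are not part of docx2python.

```python
import re

# Matches <a href="...">link text</a> and captures only the link text.
# Assumption: links appear exactly in this form in the extracted text.
ANCHOR_TAG = re.compile(r'<a href="[^"]*">(.*?)</a>', re.DOTALL)


def strip_hyperlinks(text: str) -> str:
    """Replace each anchor tag with the link text it wraps."""
    return ANCHOR_TAG.sub(r"\1", text)


sample = 'See <a href="http://example.com">link text</a> for details.'
print(strip_hyperlinks(sample))
```

You would run this over the extracted string after extraction, so the link text stays and the URLs are dropped.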
I’ve had several requests for page count / page numbers. Unfortunately, Word does not store page numbers or page breaks. These are assigned dynamically when the page renders, so there is no way to know the page count without re-implementing the hundreds-of-pages-long docx rendering specification.
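One partial workaround, offered only as a sketch: many docx files carry a cached `Pages` value in `docProps/app.xml`, written by whichever application last saved the file. It is a hint rather than a rendered page count, it can be stale or missing, and reading it is not a docx2python feature. The function name `cached_page_count` is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part (docProps/app.xml).
EXT_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/extended-properties}"


def cached_page_count(docx_file):
    """Return the cached page count from docProps/app.xml, or None.

    Assumption: the value is whatever the last saving application
    recorded; it is not recomputed here and may be absent or stale.
    """
    with zipfile.ZipFile(docx_file) as archive:
        try:
            app_xml = archive.read("docProps/app.xml")
        except KeyError:
            return None  # the file has no extended-properties part
    pages = ET.fromstring(app_xml).find(f"{EXT_NS}Pages")
    if pages is None or not (pages.text or "").strip().isdigit():
        return None
    return int(pages.text)
```

This opens only the one small XML part inside the zip archive, so the document body is never parsed. Treat a `None` result, or a number that looks wrong, as "unknown".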
On Dec 6, 2023, at 01:10, Aayush Shah @.***> wrote:
I really have found this library useful 🙏🏻
1️⃣
Can I disable hyperlink extraction? If I want the link text itself, can I just disable this feature? Like: Instead of <a href="http:/...">link text</a> I just want link text.
2️⃣
Can I get the number of pages WITHOUT reading the whole document? Say I want the user to submit a doc that is only 5 pages long; can I get the page count without loading all the contents?
Please guide, thanks.
@ShayHill Thank you so much 😄 ✌🏻
I really have found this library useful 🙏🏻
Question 1️⃣
Can I disable hyperlink extraction? If I want the link text itself, can I just disable this feature? Like: Instead of
<a href="http:/...">link text</a>
I just want link text.
Question 2️⃣
Can I get the number of pages WITHOUT reading the whole document? Say I want the user to submit a doc that is only 5 pages long; can I get the page count without loading all the contents?
Please guide, thanks.