radkovo / Pdf2Dom

Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree can then be serialized to an HTML file or processed further. A command-line utility for converting PDF documents to HTML is included in the distribution package. Pdf2Dom can also be used as an independent Java library with a standard DOM interface for your DOM-based applications, or as an alternative parser for the CSSBox rendering engine in order to add PDF processing capability to CSSBox. Pdf2Dom is based on the Apache PDFBox™ library.
http://cssbox.sourceforge.net/pdf2dom/
GNU Lesser General Public License v3.0

Background colour of node in DOM #48

Closed: ashishsharma0 closed this issue 3 years ago

ashishsharma0 commented 3 years ago

Could you please let me know how I can get the background color of an element/node in the DOM?

I am able to get the following output:

style="top:161.80327pt;left:29.21pt;line-height:7.4866333pt;font-family:Arial;font-size:7.0pt;width:48.82689pt;"

using the code below:

XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//div[text()='A300-327-GE']"; // use your XPath expression here
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(dom, XPathConstants.NODESET);
System.out.println("nodeList" + nodeList.item(0).getAttributes().item(2));

ashishsharma0 commented 3 years ago

Hi Team, could you please give an update on this? Thanks. Kind regards, Ashish

radkovo commented 3 years ago

In this case, if no background is declared in the style, the element is probably transparent. One of its parent elements may have a background assigned. For general DOM trees, it is necessary to compute the style for every element. Pdf2Dom does not provide this, but a similar problem is discussed, for example, here.
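As a rough illustration of the parent lookup described above, the following sketch walks from a node up through its ancestors and returns the first background found in an inline style attribute. It is not part of Pdf2Dom: the class name InlineBackgroundLookup and both method names are hypothetical, and it assumes that backgrounds are declared only in inline style attributes as background-color or background. It does not compute the full CSS cascade, ignores stylesheets, and its simple split on ";" would misread values that themselves contain a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not a Pdf2Dom API.
public class InlineBackgroundLookup {

    /** Splits an inline style value such as "top:1pt;left:2pt;" into a property map. */
    public static Map<String, String> parseInlineStyle(String style) {
        Map<String, String> props = new LinkedHashMap<>();
        if (style == null) {
            return props;
        }
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon > 0) {
                String name = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                String value = declaration.substring(colon + 1).trim();
                if (!name.isEmpty() && !value.isEmpty()) {
                    props.put(name, value);
                }
            }
        }
        return props;
    }

    /**
     * Returns the first non-transparent background declared inline on the node
     * or on one of its ancestors, or null if none is declared.
     * Assumption: only the inline style attribute is inspected.
     */
    public static String findDeclaredBackground(Node node) {
        for (Node current = node; current != null; current = current.getParentNode()) {
            if (current instanceof Element) {
                // getAttribute returns an empty string when the attribute is absent.
                Map<String, String> props =
                        parseInlineStyle(((Element) current).getAttribute("style"));
                String background = props.get("background-color");
                if (background == null) {
                    background = props.get("background");
                }
                if (background != null && !background.equalsIgnoreCase("transparent")) {
                    return background;
                }
            }
        }
        return null;
    }
}
```

With the node list from the question, this could be called as InlineBackgroundLookup.findDeclaredBackground(nodeList.item(0)). A null result means only that no inline background was found on the node or its ancestors; it does not prove that nothing is painted behind the text, since an unrelated element positioned at the same coordinates would not be detected.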

Anyway, a general discussion platform such as StackOverflow would be more suitable for this kind of general question. This is the issue tracker for Pdf2Dom and should be used for reporting issues only.