uniquejava / blog

My notes regarding the vibrating frontend :boom and the plain old java :rofl.
Creative Commons Zero v1.0 Universal

java xml #163

Open uniquejava opened 6 years ago

uniquejava commented 6 years ago

Options: the JDK's built-in DOM API, the JDK's built-in SAX API, jdom, dom4j, jaxb, xstream, XMLDog

Note: when saving XML, remember to escape special characters, e.g. with StringEscapeUtils.escapeXml10(...)
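To make the escaping step concrete, here is a minimal hand-rolled sketch that only replaces the five characters with predefined XML entities. It is a simplified stand-in, not the library implementation; the class name XmlEscapeSketch and the method escapeXml are hypothetical names chosen for this illustration.

```java
// Hypothetical helper for illustration only; in real code prefer a library
// routine such as StringEscapeUtils.escapeXml10(...).
final class XmlEscapeSketch {

    // Replaces the five characters that have predefined XML entities.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints <title>Tom &amp; Jerry &lt;3</title>
        System.out.println("<title>" + escapeXml("Tom & Jerry <3") + "</title>");
    }
}
```

Without this step, a raw & or < in the text would make the saved document ill-formed.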

Books:

Java and XML, 3rd Edition by Justin Edelson; Brett McLaughlin Published by O'Reilly Media, Inc., 2006

How to choose? https://docs.oracle.com/cd/E19316-01/819-3669/6n5sg7bni/index.html

Official Oracle documentation: https://docs.oracle.com/cd/E19316-01/819-3669/index.html

DOM based

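After many years of not writing Java, all I remember is DOM, SAX (event based), and the JAXB specification.

What has changed since then is that the JDK now ships with all of these. The boiled-down code is below, and it is still as complicated as ever...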

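import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Parse the XML string (content) into a DOM tree
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8))));

// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();

// Create XPath object
XPath xpath = xpathFactory.newXPath();

// Create XPathExpression object
XPathExpression expr = xpath.compile("/a/b/c/text()");

// Option 1: evaluate the expression on the document and return all matching nodes
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
return nodes;

// Option 2 (use instead of option 1): return the result as a single string
return (String) expr.evaluate(doc, XPathConstants.STRING);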
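For reference, the same DOM plus XPath steps can be put into one self-contained method. This is only a sketch: the class name XPathSketch, the method select, and the sample XML are made up for illustration, and checked exceptions are wrapped in an unchecked one to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class XPathSketch {

    // Parses the XML string into a DOM tree and returns the text content of
    // every node matched by the XPath expression.
    static List<String> select(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing or XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b><c>one</c><c>two</c></b></a>";
        System.out.println(select(xml, "/a/b/c/text()")); // prints [one, two]
    }
}
```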

Sax based

List of available callback events: http://www.saxproject.org/apidoc/org/xml/sax/ContentHandler.html
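A minimal sketch of how those callbacks are typically used: extend DefaultHandler (which implements ContentHandler), override the events of interest, and hand the handler to a SAXParser. The class name SaxSketch, the method events, and the sample XML are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxSketch {

    // Records, in order, the callbacks that the parser pushes to the handler.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text run in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<books><book>Java and XML</book></books>"));
    }
}
```

The application never asks for the next event here; the parser drives the handler, which is what "push" means.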

References:

https://stackoverflow.com/questions/2811001/how-to-read-xml-using-xpath-in-java
https://stackoverflow.com/questions/1706493/java-net-malformedurlexception-no-protocol

uniquejava commented 5 years ago

StAX based (Streaming API for XML)

StAX is a pull API. SAX is a push API. StAX can do both XML reading and writing. SAX can only do XML reading.

See: Java Read XML with StAX Parser – Cursor & Iterator APIs
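As a rough illustration of the pull style with the cursor API (XMLStreamReader), the sketch below collects the text of every book element. The class name StaxCursorSketch, the method bookTitles, and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class StaxCursorSketch {

    // Pulls events one at a time and collects the text of every <book> element.
    static List<String> bookTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int eventType = reader.next(); // the application asks for the next event
                if (eventType == XMLStreamConstants.START_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    titles.add(reader.getElementText()); // reads the text up to </book>
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String xml = "<books><book>Java and XML</book><book>Java XML and JSON</book></books>";
        System.out.println(bookTitles(xml)); // prints [Java and XML, Java XML and JSON]
    }
}
```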

What Is StAX?

Streaming API for XML (StAX) is a Java API for parsing an XML document sequentially from start to finish and also for creating XML documents. StAX was introduced by Java 6 as an alternative to SAX and DOM and is located midway between these “polar opposites.”

StAX Versus SAX and DOM

Because Java already supports SAX and DOM for document parsing and DOM for document creation, you might be wondering why another XML API is needed. The following points justify StAX’s presence in core Java:

  1. StAX (like SAX) can be used to parse documents of arbitrary sizes. In contrast, the maximum size of documents parsed by DOM is limited by the available memory, which makes DOM unsuitable for mobile devices with limited amounts of memory.

  2. StAX (like DOM) can be used to create documents. In contrast to DOM, which can create documents whose maximum size is constrained by available memory, StAX can create documents of arbitrary sizes. SAX cannot be used to create documents.

  3. StAX (like SAX) makes infoset items available to applications almost immediately. In contrast, these items are not made available by DOM until after it finishes building the tree of nodes.

  4. StAX (like DOM) adopts the pull model, in which the application tells the parser when it’s ready to receive the next infoset item. This model is based on the iterator design pattern (see http://sourcemaking.com/design_patterns/iterator ), which results in an application that’s easier to write and debug. In contrast, SAX adopts the push model, in which the parser passes infoset items via events to the application, whether or not the application is ready to receive them. This model is based on the observer design pattern (see http://sourcemaking.com/design_patterns/observer ), which results in an application that’s often harder to write and debug.

Summing up, StAX can parse or create documents of arbitrary size, makes infoset items available to applications almost immediately, and uses the pull model to put the application in charge. Neither SAX nor DOM offers all of these advantages.
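The document-creation side can be sketched with XMLStreamWriter, which emits output as each call is made instead of building a tree first. The class name StaxWriterSketch, the method writeBooks, and the element names are assumptions for this example; it writes to a StringWriter only to stay self-contained.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

final class StaxWriterSketch {

    // Streams a <books> document to a string, one <book> element per title.
    static String writeBooks(String... titles) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("books");
            for (String title : titles) {
                writer.writeStartElement("book");
                writer.writeCharacters(title); // the writer escapes characters such as & and <
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX writing failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeBooks("Java and XML", "Tom & Jerry"));
    }
}
```

Because nothing is held beyond the currently open elements, the same loop works for output far larger than available memory when the target is a file or network stream.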

Java implements StAX through types stored in the javax.xml.stream, javax.xml.stream.events, and javax.xml.stream.util packages.

For details, see: https://learning.oreilly.com/library/view/java-xml-and/9781484243305/html/394211_2_En_4_Chapter.xhtml

uniquejava commented 5 years ago

Official examples

https://docs.oracle.com/cd/E19316-01/819-3669/bnbfl/index.html

Examples on Stack Overflow

java use StAX to get children elements in a generic fashion: https://stackoverflow.com/questions/4264650/java-use-stax-to-get-children-elements-in-a-generic-fashion

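Example: transform the first document below into the second, which adds an empty <index> element as the first child of every <book>: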

<books>
    <book>....</book>
    ...
    <book>....</book>
</books>

<books>
   <book>
      <index></index>
      ....
   </book>
   ...
   <book>
      <index></index>
      ....
   </book>
</books>

Code

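import java.io.FileInputStream;
import java.io.FileWriter;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

XMLInputFactory inFactory = XMLInputFactory.newInstance();
XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream("1.xml"));
XMLOutputFactory factory = XMLOutputFactory.newInstance();
// file is the output file, defined elsewhere
XMLEventWriter writer = factory.createXMLEventWriter(new FileWriter(file));
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
while (eventReader.hasNext()) {
    XMLEvent event = eventReader.nextEvent();
    // Copy every input event to the output unchanged
    writer.add(event);
    if (event.getEventType() == XMLEvent.START_ELEMENT) {
        QName qname = event.asStartElement().getName();
        String name = qname.getLocalPart();
        if (name.equals("book")) {
            // Right after each <book> start tag, insert an empty <index></index>
            writer.add(eventFactory.createStartElement("", null, "index"));
            writer.add(eventFactory.createEndElement("", null, "index"));
        }
    }
}
writer.close();
eventReader.close();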

Finally, you can replace the FileWriter with new BufferedOutputStream(new FileOutputStream(file)) and the FileInputStream with new BufferedInputStream(new FileInputStream("1.xml")).

QName

Note: QName describes a qualified name as a combination of namespace URI, local part, and prefix components. After instantiating this immutable class (via a constructor such as QName(String namespaceURI, String localPart, String prefix)), you can return these components by calling QName’s String getNamespaceURI(), String getLocalPart(), and String getPrefix() methods.
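A small sketch of those accessors in use. The namespace URI http://example.com/books, the prefix bk, and the class name QNameSketch are made up for illustration.

```java
import javax.xml.namespace.QName;

final class QNameSketch {

    // Formats the three components as {namespaceURI}prefix:localPart.
    static String describe(QName name) {
        return "{" + name.getNamespaceURI() + "}" + name.getPrefix() + ":" + name.getLocalPart();
    }

    public static void main(String[] args) {
        QName qualified = new QName("http://example.com/books", "book", "bk");
        System.out.println(describe(qualified)); // prints {http://example.com/books}bk:book

        // With only a local part, the namespace URI and prefix are empty strings.
        QName local = new QName("index");
        System.out.println(describe(local)); // prints {}:index
    }
}
```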