Use Case
Import existing XML metadata (EAD, MODS, etc.) into a native JSON format for Strawberryfield. This is handy for migrations from external sources, where we want to keep the existing data and schemas but cast them into a more general JSON format so that our webform system (https://github.com/esmero/webform_strawberryfield) can handle further editing and creation.
Problem
Given a simple XML like
<?xml version='1.0' standalone='yes'?>
<archdesc localtype="inventory" level="subgrp">
  <did>
    <head>Overview of the Records</head>
    <repository label="Repository:">
      <corpname>
        <part>Minnesota Historical Society</part>
      </corpname>
    </repository>
    <origination label="Creator:">
      <corpname>
        <part>Minnesota. Game and Fish Department</part>
      </corpname>
    </origination>
    <unittitle label="Title:">Game laws violation records,</unittitle>
    <unitdate label="Dates:">1908-1928</unitdate>
    <abstract label="Abstract:">Records of prosecutions for and seizures of property resulting from violation of the state's hunting and fishing laws.</abstract>
    <physdesc label="Quantity:">2.25 cu. ft. (7 v. and 1 folder in 3 boxes)</physdesc>
    <physloc label="Location:">See Detailed Description section for box location</physloc>
  </did>
</archdesc>
A short PHP snippet (for example, SimpleXML combined with json_encode()) would easily deal with XML to JSON and, if needed, to array casting.
But:
For XML elements that carry both attributes (@attributes) and a text value, the default JSON serialization discards the attributes entirely, ending in an array like
[unittitle] => Game laws violation records,
[unitdate] => 1908-1928
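The attribute loss described above can be reproduced with a minimal sketch. It assumes PHP's bundled SimpleXML and JSON extensions; the trimmed `<did>` fragment is only an illustration and is not the snippet from the original write-up.

```php
<?php
// Trimmed-down fragment of the EAD sample: two elements that carry
// both a "label" attribute and a text value.
$ead = '<did>'
    . '<unittitle label="Title:">Game laws violation records,</unittitle>'
    . '<unitdate label="Dates:">1908-1928</unitdate>'
    . '</did>';

// Naive round trip: XML -> JSON -> PHP array.
$xml = simplexml_load_string($ead);
$array = json_decode(json_encode($xml), TRUE);

// Only the text values survive; the "label" attributes are gone.
print_r($array);
```

Elements that have attributes and child elements (but no text of their own) keep an `@attributes` key under this default casting; the loss only affects elements that mix attributes with a text value.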
Solution
Deal with JSON serialization the same way JSON-LD does: use the @value key for the actual text value and a custom @attributes key, or even a @type key with a mapping @context that helps bring non-semantic elements coming from an XML schema into a local context.
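As an illustration of the target shape, here is a hand-written sketch for a single element. The array is built by hand and is not the output of any existing Strawberryfield code; the `@attributes` key name follows the decorator shown later in this document, and no `@type` or `@context` mapping is attempted here.

```php
<?php
// Desired representation of:
//   <unittitle label="Title:">Game laws violation records,</unittitle>
// The text value lives under "@value", the XML attributes under "@attributes".
$unittitle = [
    '@attributes' => ['label' => 'Title:'],
    '@value' => 'Game laws violation records,',
];

$json = json_encode(['unittitle' => $unittitle], JSON_PRETTY_PRINT);
echo $json, "\n";
```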
This implies:
1.- Build a decorator class for the JSON Serialization
2.- Subclass the SimpleXMLElement class
3.- Build a Composer-aware PHP library we can include in Strawberryfield
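Step 2 could be sketched roughly as follows. This is only an assumption about how the subclass might look: the class name `JsonLDSimpleXMLElement` is hypothetical, and the sketch leaves out the depth and option handling that a decorator could add. It relies on the second parameter of `simplexml_load_string()`, which selects the class used for every node.

```php
<?php
// Hypothetical subclass: every node parsed with this class knows how to
// serialize itself with "@attributes" and "@value" keys.
class JsonLDSimpleXMLElement extends SimpleXMLElement implements JsonSerializable
{
    #[\ReturnTypeWillChange]
    public function jsonSerialize()
    {
        $data = [];
        // Copy XML attributes, if any, as plain strings.
        foreach ($this->attributes() as $name => $value) {
            $data['@attributes'][$name] = (string) $value;
        }
        // Child elements: repeated names are grouped into a list.
        foreach ($this->children() as $name => $child) {
            if (!isset($data[$name])) {
                $data[$name] = $child;
            } else {
                if (!is_array($data[$name])) {
                    $data[$name] = [$data[$name]];
                }
                $data[$name][] = $child;
            }
        }
        // Text value: stored under "@value" when other keys exist,
        // otherwise returned as a bare string.
        $text = trim((string) $this);
        if ($text !== '') {
            if ($data) {
                $data['@value'] = $text;
            } else {
                $data = $text;
            }
        }
        return $data === [] ? NULL : $data;
    }
}

$ead = '<did>'
    . '<unittitle label="Title:">Game laws violation records,</unittitle>'
    . '<unitdate label="Dates:">1908-1928</unitdate>'
    . '</did>';
$xml = simplexml_load_string($ead, JsonLDSimpleXMLElement::class);
$json = json_encode($xml, JSON_PRETTY_PRINT);
echo $json, "\n";
```

A subclass keeps the calling code short, since `json_encode()` works directly on the parsed document. A decorator has the advantage of wrapping SimpleXMLElement objects that were created elsewhere.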
Potential Code and Discussion
This is a great way of dealing with XML while integrating our own code. It would also let us accommodate files already processed by other systems (migrations) or data fed by external APIs, and then cast the result via Twig into visualizations, index it in our Solr, etc.
/**
 * Class JsonLDSimpleXMLElementDecorator
 *
 * Implements JsonSerializable for SimpleXMLElement as a decorator with JSON-LD style keys.
 */
class JsonLDSimpleXMLElementDecorator implements JsonSerializable
{
    const DEF_DEPTH = 512;

    // Option keys match the JSON keys emitted by jsonSerialize().
    private $options = ['@attributes' => TRUE, '@value' => TRUE, 'depth' => self::DEF_DEPTH];

    /**
     * @var SimpleXMLElement
     */
    private $subject;

    public function __construct(SimpleXMLElement $element, $useAttributes = TRUE, $useValue = TRUE, $depth = self::DEF_DEPTH) {
        $this->subject = $element;
        if (!is_null($useAttributes)) {
            $this->useAttributes($useAttributes);
        }
        if (!is_null($useValue)) {
            $this->useValue($useValue);
        }
        if (!is_null($depth)) {
            $this->setDepth($depth);
        }
    }

    public function useAttributes($bool) {
        $this->options['@attributes'] = (bool) $bool;
    }

    public function useValue($bool) {
        $this->options['@value'] = (bool) $bool;
    }

    public function setDepth($depth) {
        $this->options['depth'] = (int) max(0, $depth);
    }

    /**
     * Specify data which should be serialized to JSON
     *
     * @return mixed data which can be serialized by json_encode.
     */
    #[\ReturnTypeWillChange] // avoids the PHP 8.1 return type deprecation notice
    public function jsonSerialize() {
        $subject = $this->subject;
        $array = array();

        // json encode attributes if any.
        if ($this->options['@attributes']) {
            if ($attributes = $subject->attributes()) {
                $array['@attributes'] = array_map('strval', iterator_to_array($attributes));
            }
        }

        // traverse into children only while the depth budget allows it
        $children = $subject;
        $depth = $this->options['depth'] - 1;
        if ($depth <= 0) {
            $children = [];
        }

        // json encode child elements if any. group on duplicate names as an array.
        foreach ($children as $name => $element) {
            /* @var SimpleXMLElement $element */
            $decorator = new self($element);
            // children inherit the parent's options with a reduced depth
            $decorator->options = ['depth' => $depth] + $this->options;
            if (isset($array[$name])) {
                if (!is_array($array[$name])) {
                    $array[$name] = [$array[$name]];
                }
                $array[$name][] = $decorator;
            } else {
                $array[$name] = $decorator;
            }
        }

        // json encode non-whitespace element simplexml text values.
        $text = trim((string) $subject);
        if ($text !== '') {
            if ($array) {
                // attributes or children exist: keep the text under @value
                if ($this->options['@value']) {
                    $array['@value'] = $text;
                }
            } else {
                $array = $text;
            }
        }

        // return empty elements as NULL (self-closing or empty tags);
        // the strict comparison keeps a literal "0" text value intact
        if ($array === []) {
            $array = NULL;
        }

        return $array;
    }
}
Usage would be (with $ead holding the XML string):
$xml = new SimpleXMLElement($ead);
$xml = new JsonLDSimpleXMLElementDecorator($xml, TRUE, TRUE, 3);
echo json_encode($xml, JSON_PRETTY_PRINT), "\n";
This code is adapted (only a few lines changed, really) from https://hakre.wordpress.com/2013/07/10/simplexml-and-json-encode-in-php-part-iii-and-end/ and it's pretty cool!
Webform integration
This will require form elements to read and write the @attributes key, which can be generalized through the custom JSON properties that each Webform element can have.