sabre-io / xml

sabre/xml is an XML library that you may not hate.
http://sabre.io/xml/
BSD 3-Clause "New" or "Revised" License

[question] parsing child elements which don't have parent namespace #189

Closed waspinator closed 4 years ago

waspinator commented 4 years ago

I'm trying to parse an XML file with multiple namespaces. These are only applied to some parent-level elements, but not to their children. How would I define a service to accommodate this type of structure?

Using the Atom example as a template,

class Service extends \Sabre\Xml\Service
{
    const PESC_NS = 'urn:org:pesc:message:AcademicRecordBatch:v2.0.0';
    const PESC_DEFAULT_PREFIX = 'pesc';

    public function __construct()
    {
        $this->namespaceMap[self::PESC_NS] = self::PESC_DEFAULT_PREFIX;

        $pesc = '{'.self::PESC_NS.'}';

        $this->mapValueObject($pesc.'AcademicRecordBatch', Element\AcademicRecordBatch::class);
        $this->mapValueObject($pesc.'BatchEnvelope', Element\BatchEnvelope::class);
    }
}

And creating a couple of new classes:

class AcademicRecordBatch
{
    public $BatchEnvelope;
    public $BatchContent;
}

class BatchEnvelope
{
    public $BatchID;
}

The parsed object doesn't contain any data from the child elements:

object(\AcademicRecordBatch) {
    BatchEnvelope => null
    BatchContent => null
}

Example XML:

<?xml version="1.0"?>
<AcRecBat:AcademicRecordBatch xmlns:AcRec="urn:org:pesc:sector:AcademicRecord:v1.4.0" xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.0.0" xmlns:core="urn:org:pesc:core:CoreMain:v1.5.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:pesc:message:AcademicRecordBatch:v2.0.0 AcademicRecordBatch_v2.0.0.xsd">
  <BatchEnvelope>
    <BatchID>123</BatchID>
    <SourceAgency>
      <Organization>
        <MutuallyDefined>ABC</MutuallyDefined>
        <OrganizationName>BCD</OrganizationName>
      </Organization>
    </SourceAgency>
  </BatchEnvelope>
  <BatchContent>
    <ColTrn:CollegeTranscript xmlns:AcRec="urn:org:pesc:sector:AcademicRecord:v1.7.0" xmlns:ColTrn="urn:org:pesc:message:CollegeTranscript:v1.4.0" xmlns:core="urn:org:pesc:core:CoreMain:v1.12.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:pesc:message:CollegeTranscript:v1.4.0 CollegeTranscript_v1.4.0.xsd">
      <TransmissionData>
        <DocumentID>234</DocumentID>
      </TransmissionData>
      <Student>
        <Person>
          <SchoolAssignedPersonID>345</SchoolAssignedPersonID>
          <Name>
            <FirstName>John</FirstName>
            <LastName>Doe</LastName>
          </Name>
        </Person>
        <AcademicRecord>
          <AcademicSession>
            <AcademicSessionDetail>
              <SessionDesignator>2019-09</SessionDesignator>
            </AcademicSessionDetail>
            <Course>
              <CourseCreditValue>3</CourseCreditValue>
            </Course>
            <Course>
              <CourseCreditValue>4</CourseCreditValue>
            </Course>
            <AcademicSummary>
              <AcademicSummaryType>Cumulative</AcademicSummaryType>
              <GPA>
                <GradePointAverage>90</GradePointAverage>
              </GPA>
            </AcademicSummary>
          </AcademicSession>
          <AcademicSession>
            <AcademicSessionDetail>
              <SessionDesignator>2020-09</SessionDesignator>
            </AcademicSessionDetail>
            <AcademicSummary>
              <AcademicSummaryType>Cumulative</AcademicSummaryType>
              <GPA>
                <GradePointAverage>91</GradePointAverage>
              </GPA>
            </AcademicSummary>
          </AcademicSession>
        </AcademicRecord>
        <NoteMessage>note</NoteMessage>
      </Student>
    </ColTrn:CollegeTranscript>
  </BatchContent>
</AcRecBat:AcademicRecordBatch>

evert commented 4 years ago

Unfortunately, the keyValue deserializer only maps elements from one namespace to the object and ignores the rest.

However, since BatchEnvelope and BatchContent are in the same namespace, you can still manually hook up the mapping in each direction yourself.

Here's where mapValueObject registers the Deserializer:

https://github.com/sabre-io/xml/blob/master/lib/Service.php#L247

You have access to the $elementMap, so if you set the valueObject deserializer yourself, you can give it a different third argument for the namespace.

Hope this helps!
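
A minimal sketch of that suggestion, not a confirmed fix from the thread. It assumes sabre/xml is installed through Composer, uses a trimmed copy of the example XML, and declares the two value classes in the global namespace for brevity. It also assumes that unprefixed elements report an empty namespace, so the third argument is '' and the Clark-notation key for BatchEnvelope is '{}BatchEnvelope'.

```php
<?php
// Assumption: sabre/xml was installed with `composer require sabre/xml`.
require __DIR__ . '/vendor/autoload.php';

use Sabre\Xml\Reader;
use Sabre\Xml\Service;
use function Sabre\Xml\Deserializer\valueObject;

class AcademicRecordBatch
{
    public $BatchEnvelope;
    public $BatchContent;
}

class BatchEnvelope
{
    public $BatchID;
}

$service = new Service();

// The root element lives in the AcRecBat namespace, but its children are
// unprefixed. The third argument names the namespace whose child elements
// are mapped onto the object, so it is '' here rather than the root's URI.
$service->elementMap['{urn:org:pesc:message:AcademicRecordBatch:v2.0.0}AcademicRecordBatch'] =
    function (Reader $reader) {
        return valueObject($reader, AcademicRecordBatch::class, '');
    };

// An element without a namespace is keyed as '{}LocalName'.
$service->elementMap['{}BatchEnvelope'] = function (Reader $reader) {
    return valueObject($reader, BatchEnvelope::class, '');
};

// Trimmed version of the example document from the question.
$xml = <<<XML
<?xml version="1.0"?>
<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.0.0">
  <BatchEnvelope>
    <BatchID>123</BatchID>
  </BatchEnvelope>
  <BatchContent/>
</AcRecBat:AcademicRecordBatch>
XML;

$batch = $service->parse($xml);
var_dump($batch->BatchEnvelope->BatchID);
```

Because mapValueObject is bypassed, this only wires up parsing. Writing the objects back out would need matching entries in the service's classMap.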

waspinator commented 4 years ago

So what would I change the namespace to in order to ignore the AcRecBat part of xmlns:AcRecBat? The children probably aren't picked up by the Deserializer because they aren't part of the same namespace?

Playing around with the Atom example, and changing

<feed xmlns="http://www.w3.org/2005/Atom">

to

<AcRecBat:feed xmlns:AcRecBat="http://www.w3.org/2005/Atom">

breaks the Deserializer. Adding the AcRecBat: prefix to simple elements makes them work again,

example:

<AcRecBat:title type="text">dive into mark</AcRecBat:title>

but adding the prefix to object elements does not.

example

<AcRecBat:entry>

still returns []
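
The experiment above comes down to which namespace URI each element reports; the prefix itself is only a local alias. The sketch below uses PHP's built-in XMLReader (the class that Sabre\Xml\Reader extends) on a made-up three-element document to list the namespace of every element. It illustrates the namespace rules only and does not reproduce the Atom example.

```php
<?php
// Hypothetical document: a prefixed root, one prefixed child, one unprefixed child.
$xml = <<<XML
<AcRecBat:feed xmlns:AcRecBat="http://www.w3.org/2005/Atom">
  <AcRecBat:title type="text">dive into mark</AcRecBat:title>
  <entry/>
</AcRecBat:feed>
XML;

$reader = new XMLReader();
$reader->XML($xml);

// Collect qualified name => namespace URI for every element node.
$namespaces = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $namespaces[$reader->name] = $reader->namespaceURI;
    }
}
$reader->close();

print_r($namespaces);
```

With no default xmlns declared, the unprefixed entry element reports an empty namespace URI, while both prefixed elements report the Atom URI. A deserializer that compares against the Atom URI would therefore skip entry.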

waspinator commented 4 years ago

For now, simplexml_load_string seems to do the job:

$xml_object = simplexml_load_string($xml_string);

$batch_envelope_object = $xml_object->BatchEnvelope;
$college_transcript_object = $xml_object->BatchContent->children('urn:org:pesc:message:CollegeTranscript:v1.4.0')->children();
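
For reference, a self-contained variant of the same workaround, run against a trimmed copy of the example XML. The variable names $batch_id, $transcript and $document_id are illustrative and not from the original post.

```php
<?php
// Trimmed version of the example document from the question.
$xml_string = <<<XML
<?xml version="1.0"?>
<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.0.0">
  <BatchEnvelope>
    <BatchID>123</BatchID>
  </BatchEnvelope>
  <BatchContent>
    <ColTrn:CollegeTranscript xmlns:ColTrn="urn:org:pesc:message:CollegeTranscript:v1.4.0">
      <TransmissionData>
        <DocumentID>234</DocumentID>
      </TransmissionData>
    </ColTrn:CollegeTranscript>
  </BatchContent>
</AcRecBat:AcademicRecordBatch>
XML;

$xml_object = simplexml_load_string($xml_string);

// Unprefixed children are reachable with plain property access.
$batch_id = (string) $xml_object->BatchEnvelope->BatchID;

// children($namespaceUri) switches to the children in that namespace.
$transcript = $xml_object->BatchContent
    ->children('urn:org:pesc:message:CollegeTranscript:v1.4.0')
    ->CollegeTranscript;

// children() without arguments switches back to the unprefixed children.
$document_id = (string) $transcript->children()->TransmissionData->DocumentID;

echo $batch_id, ' ', $document_id, PHP_EOL;
```

Matching on the namespace URI rather than on the ColTrn prefix means the lookup still works if a sender picks a different prefix for the same namespace.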