popbr / data-integration

Attributes with the same name #17

Open MNSleeper opened 1 year ago

In file formats like XML, pulling an element by its tag requires prior knowledge of the format, and sometimes requires more in-depth parsing to get the correct tag.

For example:

<Award>
    <School>
        <Name> Augusta University</Name>
    </School>
    <Officer>
        <Name> John </Name>
    </Officer>
</Award>

In this sample file, to retrieve the Name element of School, you would either have to pull the node list of all children of School, or just search for the first element with the Name tag, which only works here because School happens to come first. If we want the Name of Officer, we have to pull the child node list of Officer.
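
To make the ambiguity concrete, here is a rough sketch using Python's standard `xml.etree.ElementTree`. The choice of Python and of this parser is an assumption for illustration only; the issue does not say which parser the project uses, and the variable names are made up.

```python
import xml.etree.ElementTree as ET

# The sample document from this issue (hypothetical variable name).
SAMPLE = """
<Award>
    <School>
        <Name> Augusta University</Name>
    </School>
    <Officer>
        <Name> John </Name>
    </Officer>
</Award>
""".strip()

root = ET.fromstring(SAMPLE)

# A tag-only search returns whichever Name comes first in document order.
first_name = root.find(".//Name").text.strip()

# A tag-only search for every match returns both, with no hint of the parent.
all_names = [el.text.strip() for el in root.iter("Name")]

# Picking a specific Name means spelling out its parent,
# which is exactly the format knowledge we would like to avoid.
school_name = root.find("School/Name").text.strip()
officer_name = root.find("Officer/Name").text.strip()

print(first_name)    # the School name, only because School is listed first
print(all_names)     # both names, indistinguishable by tag alone
print(school_name, "/", officer_name)
```

The last two lookups only work because the paths `School/Name` and `Officer/Name` were written by someone who already knew the layout.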

Either way, this requires knowledge of the file format before parsing. That is a problem, because it means each file potentially needs its own parsing instructions before we can pull the data we want.