fuweichin / image-info-extractor

A JavaScript lib to read image info and to extract/parse image metadata
MIT License
4 stars 0 forks source link

RDF Structs do not get parsed correctly #1

Closed leifniem closed 8 months ago

leifniem commented 8 months ago

Hi,

i recently used your library in one of my projects and was noticing that nested RDF structs do not get parsed correctly.

For example the following raw XMP data:

   <crs:MaskGroupBasedCorrections>
    <rdf:Seq>
     <rdf:li>
      <rdf:Description
       crs:What="Correction"
       crs:CorrectionAmount="1"
       crs:CorrectionActive="true"
       ...>
      <crs:CorrectionMasks>
       <rdf:Seq>
        <rdf:li
         crs:What="Mask/CircularGradient"
         crs:MaskActive="true"
         ...
         crs:Version="2"/>
       </rdf:Seq>
      </crs:CorrectionMasks>
      </rdf:Description>
     </rdf:li>
    </rdf:Seq>
   </crs:MaskGroupBasedCorrections>

results in the following object:

    MaskGroupBasedCorrections: [
        "\\n      \\n      \\n       \\n        \\n       \\n      \\n      \\n     "
    ],

Another example would be the Look Struct, which just appears as null in the parsed version.

fuweichin commented 8 months ago

The problem has been fixed with commit 942c6e1e633725ea8dc4991c0e062524e4825060.

To parse tag values as number/boolean instead of string, you also need to extend the xmpTagTypeMap with vendor-specific presets, see example https://github.com/fuweichin/image-info-extractor/blob/main/examples/image-info-extractor.js#L8.

B.T.W. Those XMP tagTypeMap presets are collected from https://exiftool.org/TagNames/XMP.html, with help of DevTools Console script,

let extractFromTable = (tbody, prefix) => {
  let items = [];
  [...tbody.rows].forEach((tr,i)=>{
    if(i===0){
      return;
    }
    let {cells} = tr;
    let tagName = cells[0].textContent;
    let writable = cells[1].textContent;
    let m = /^(integer|boolean|real|rational)/.exec(writable);
    if(!m) {
      return;
    }
    let tagType = m[1];
    if(writable.endsWith('+')) {
      tagType+='[]';
    }
    items.push(`  '${prefix}:${tagName}': '${tagType}',`);
  });
  return items;
}

// console.log(extractFromTable(temp1, 'crs').join('\n'))

Try to make your own if you have further needs.

leifniem commented 8 months ago

Thanks a lot, will do :)

leifniem commented 8 months ago

It seems that your helper console script ignores string values of the table, i don't know if that is by accident or intentional.

fuweichin commented 8 months ago

xmpTagTypeMap is designed to specify types of non-string XMP tags when parsing, by default an XMP tag will be parsed as string.