ErwinKomen / CrppServer

Corpus Research Project Processor - server

txtlist: add text meta information #1

Open ErwinKomen opened 7 years ago

ErwinKomen commented 7 years ago

The /txtlist command should include relevant (and generic) metadata information per text.

ErwinKomen commented 7 years ago

Steps:

  1. Call a function getMetaInfo() from CrpManager.getTextList()
  2. Make a function to extract the meta information as a JSON object from a file: Parse.getMetaInfo()
  3. Make a function to extract one piece of meta information as a String: Parse.getMetaElement()
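
A minimal sketch of how the three steps could fit together. The signatures are assumptions, not the project's actual code: a text header is modelled as a plain Map, and a Map also stands in for the JSON object so that no JSON library is needed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MetaInfoSketch {
    // Hypothetical stand-in for Parse.getMetaElement():
    // returns one piece of meta information as a String ("" when absent).
    static String getMetaElement(Map<String, String> header, String name) {
        String value = header.get(name);
        return value == null ? "" : value;
    }

    // Hypothetical stand-in for Parse.getMetaInfo():
    // collects the requested elements of one text into a single object.
    static Map<String, String> getMetaInfo(Map<String, String> header, List<String> names) {
        Map<String, String> info = new LinkedHashMap<>();
        for (String name : names) {
            info.put(name, getMetaElement(header, name));
        }
        return info;
    }

    // Hypothetical stand-in for CrpManager.getTextList():
    // attaches the meta information to every text in the list.
    static Map<String, Map<String, String>> getTextList(
            Map<String, Map<String, String>> headersByText, List<String> names) {
        Map<String, Map<String, String>> list = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> e : headersByText.entrySet()) {
            list.put(e.getKey(), getMetaInfo(e.getValue(), names));
        }
        return list;
    }
}
```

In this sketch a missing element yields an empty string rather than null, so every text in the list carries the same set of keys.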

ErwinKomen commented 7 years ago

1: Done 2: Done 3: in progress...

ErwinKomen commented 7 years ago

I've dealt with [3] now. The metadata elements that are extracted are these:

  1. title
  2. genre
  3. author
  4. date
  5. subtype

The file metaelement.json is read; it contains all possible XPath/Attr combinations where the meta elements can be found, either in the text's header or in an accompanying .cmdi/.imdi file.
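
A rough sketch of how such a lookup could work, using the standard javax.xml.xpath API. The Rule class, the method name findMetaElement, and the example paths are hypothetical. The actual layout of metaelement.json is not shown here, so the rules are passed in as a plain list instead of being read from that file.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class MetaElementLookup {
    // Hypothetical representation of one XPath/Attr combination.
    // An empty attr means: take the text content of the element.
    static final class Rule {
        final String xpath;
        final String attr;
        Rule(String xpath, String attr) { this.xpath = xpath; this.attr = attr; }
    }

    // Try each rule in order and return the first non-empty value, or "".
    static String findMetaElement(String xml, List<Rule> rules) {
        XPath xp = XPathFactory.newInstance().newXPath();
        for (Rule r : rules) {
            String expr = r.attr.isEmpty() ? r.xpath : r.xpath + "/@" + r.attr;
            try {
                // Evaluating to a String gives "" when nothing matches.
                String value = xp.evaluate(expr, new InputSource(new StringReader(xml)));
                if (!value.trim().isEmpty()) {
                    return value.trim();
                }
            } catch (XPathExpressionException ex) {
                throw new IllegalArgumentException("Bad XPath or XML: " + expr, ex);
            }
        }
        return "";
    }
}
```

The same list of rules could be tried first on the text's own header and then on the accompanying .cmdi/.imdi file. That ordering is a design choice of this sketch, not something stated above.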