VertNet / dwc-indexer

Google App Engine project for indexing DwC text files into Search API Documents
GNU Lesser General Public License v3.0

Missing DwC terms in index #23

Closed tucotuco closed 9 years ago

tucotuco commented 10 years ago

Some fields are not making it into the verbatim_record in the index. These include type, rights, and rightsholder. The KU collections should have all of these.

tucotuco commented 10 years ago

Check latest harvest to see if these make it into the GCS files.
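One way to do this check is sketched below. It assumes the harvest file is tab-delimited and that its first line is a header row of lowercase field names; neither is confirmed in this thread. The function name `missing_terms` and the `EXPECTED_TERMS` tuple are hypothetical, chosen only for illustration.

```python
import csv
import io

# Terms reported missing in this issue (assumed lowercase, as in the harvest schema).
EXPECTED_TERMS = ("type", "rights", "rightsholder")


def missing_terms(header_line, expected=EXPECTED_TERMS, delimiter="\t"):
    """Return the expected terms that are absent from a delimited header line.

    Assumption: the first line of the harvest file is a header row.
    """
    reader = csv.reader(io.StringIO(header_line), delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip().lower() for name in header}
    return [term for term in expected if term not in present]


if __name__ == "__main__":
    # Hypothetical header line, not taken from a real harvest file.
    print(missing_terms("id\ttype\tnamepublishedinyear"))
    # → ['rights', 'rightsholder']
```

An empty result would mean all of the listed terms reached the file, which would point to the indexing step rather than the harvest.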

tucotuco commented 10 years ago

previousIdentifications

See https://github.com/VertNet/webapp/issues/449

laurarussell commented 10 years ago

http://portal.vertnet.org/o/mvz/mvzmammals?id=67504 is a record for checking when this is resolved. It should have dwc:previousIdentifications.

tucotuco commented 10 years ago

Recent changes to fields.clj do not affect the harvest file schema. It still ends in namepublishedinyear instead of year.

laurarussell commented 10 years ago

This affects the availability of these fields in downloads too. @tucotuco what is the timeline on this one? We're getting ready to launch a blog post indicating that the VertNet norms will now be included on every record in our hosted data sets, but until this is resolved they won't appear in the downloads.

tucotuco commented 10 years ago

See https://github.com/VertNet/webapp/issues/512 for a full list of missing terms.

tucotuco commented 9 years ago

All of the new fields (type, license, rightsholder, references, accessrights, and previousidentifications) are in the new harvest and index, but will not appear in the old index, nor in the portal until the portal has been updated to use the new index.