OregonDigital / OD2

Next generation of Oregon Digital ( https://oregondigital.org ) digital collections platform, built on Samvera Hyrax ( https://github.com/samvera/hyrax/ )

Controlled vocab labels dropped when fetching and indexing again #1198

Closed straleyb closed 4 years ago

straleyb commented 4 years ago

Descriptive summary

If there is any info we would like to catalog, this is the place to do it. My proposed fix is to keep a second copy of the metadata around while an object is being updated, added to a collection, or otherwise changed. Any time a reindex occurs, the data won't show up until some time afterward. By keeping a second Solr record around, we can use it until the update finishes, then add the new data to the old Solr record.

CGillen commented 4 years ago

An idea I had was to merge the current Solr document with the one being generated by the indexer (before the fetch job is started). While generating the replacement Solr document, we can look at the current Solr document, match up any URIs that appear in both, and pull the corresponding labels across. This would give us temporary labels while the authorities/Blazegraph are being hit, and any new or updated labels should come through when the async job finishes.

https://github.com/samvera/hyrax/blob/v2.7.2/app/indexers/hyrax/work_indexer.rb#L7 This is basically where we would want to do that work.
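A rough sketch of that merge step in plain Ruby, to make the idea concrete. Everything named here is an assumption for illustration: the helper `carry_over_labels`, the field names `creator_sim` and `creator_label_sim`, and the `label$uri` string encoding are hypothetical and may not match the real OD2 schema. In practice this logic would sit inside the indexer's `generate_solr_document` and read the current document from Solr.

```ruby
# Hypothetical helper (not part of Hyrax or OD2): copies labels from the
# current Solr document into the freshly generated one for every URI that
# appears in both. Documents are plain hashes of field name => array.
#
# field_map maps a URI field to its label field. Label entries are ASSUMED
# to be encoded as "label$uri"; adjust the parsing to the real schema.
def carry_over_labels(current_doc, new_doc, field_map)
  merged = new_doc.dup
  field_map.each do |uri_field, label_field|
    # Index the existing label entries by URI.
    known = Array(current_doc[label_field]).each_with_object({}) do |entry, acc|
      label, sep, uri = entry.rpartition('$')
      acc[uri] = entry unless sep.empty? || label.empty?
    end

    # Keep labels only for URIs that are still on the object.
    carried = Array(new_doc[uri_field]).map { |uri| known[uri] }.compact
    next if carried.empty?

    # Do not clobber anything the indexer already produced for this field.
    merged[label_field] = Array(new_doc[label_field]) | carried
  end
  merged
end

# Example: one URI survives the edit, one is removed, one is new.
current = {
  'creator_sim'       => ['http://example.org/a', 'http://example.org/b'],
  'creator_label_sim' => ['Alice$http://example.org/a', 'Bob$http://example.org/b']
}
fresh = { 'creator_sim' => ['http://example.org/a', 'http://example.org/c'] }

merged = carry_over_labels(current, fresh, { 'creator_sim' => 'creator_label_sim' })
# merged['creator_label_sim'] => ["Alice$http://example.org/a"]
```

The label for the removed URI is dropped, and the new URI simply has no label until the async fetch job finishes and reindexes.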

CGillen commented 4 years ago

Discussion moved to #1207