sul-dlss-labs / ld4p

placeholder github repo for issues, specs and documents for LD4P work

Document Solr and Stanford::Mods normalization methods #76

Closed jgreben closed 7 years ago

jgreben commented 7 years ago

Document, or provide a link to an existing document about, how Solr intrinsically handles some punctuation, and what stanford-mods provides for us in terms of normalization.

ndushay commented 7 years ago

Sadly the documentation is in the stanford-mods code. :-P

ndushay commented 7 years ago

Given the new consul document about SearchWorks indexing: https://consul.stanford.edu/pages/viewpage.action?pageId=156860714

Can we close this?

jgreben commented 7 years ago

👍 I added a link to the schema.xml file and an additional bullet point.

- Look it up in Solr schema.xml.  How the Solr field processes the raw text at index time informs the minimum processing for the field value being sent to Solr.
-- Here you will find how a particular data field is modified/normalized for Solr indexing purposes.
-- is it single valued or multi-valued?
-- how is it modified by Solr processing?
--- tokenized, etc.
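
The lookup described in the bullets above can be sketched in Python. This is only a rough illustration, not the SearchWorks tooling: the schema fragment, the field names `title_search` and `id`, and the helper `describe_field` are all hypothetical, and the sketch assumes that a field without an explicit `multiValued` attribute falls back to its field type and then to single-valued.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml fragment; a real schema will define different fields.
SCHEMA_XML = """
<schema name="example" version="1.6">
  <fieldType name="text_general" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <fieldType name="string" class="solr.StrField"/>
  <field name="title_search" type="text_general" indexed="true" multiValued="true"/>
  <field name="id" type="string" indexed="true" stored="true"/>
</schema>
"""


def describe_field(schema_xml, field_name):
    """Report whether a field is multi-valued and list its index-time analysis steps."""
    root = ET.fromstring(schema_xml.strip())
    field = next((f for f in root.iter("field") if f.get("name") == field_name), None)
    if field is None:
        raise KeyError(field_name)

    type_name = field.get("type")
    field_type = next(
        (t for t in root.iter("fieldType") if t.get("name") == type_name), None
    )

    # Assumption: an explicit attribute on the field wins, then the field type,
    # then single-valued.
    multi = field.get("multiValued")
    if multi is None and field_type is not None:
        multi = field_type.get("multiValued")

    # Collect char filters, tokenizer, and filters that apply at index time.
    # An <analyzer> with no type attribute applies to both indexing and querying.
    chain = []
    if field_type is not None:
        for analyzer in field_type.iter("analyzer"):
            if analyzer.get("type") in (None, "index"):
                for step in analyzer:
                    if step.tag in ("charFilter", "tokenizer", "filter"):
                        chain.append(step.get("class"))

    return {"type": type_name, "multi_valued": multi == "true", "index_chain": chain}


if __name__ == "__main__":
    for name in ("title_search", "id"):
        print(name, describe_field(SCHEMA_XML, name))
```

An empty `index_chain` (as for the `string` field here) suggests Solr stores the value as sent, so any normalization has to happen before the value reaches Solr.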

ndushay commented 7 years ago

cool! Does that mean we can close this issue? I also added more info on where to look for code in solrmarc-sw.

jgreben commented 7 years ago

I think yes!

Josh

On Apr 19, 2017, at 4:31 PM, Naomi Dushay wrote:

cool! Does that mean we can close this issue? I also added more info on where to look for code in solrmarc-sw.
