nulib / arch

Northwestern University institutional repository, built on Samvera's Hyrax gem.

IR OAI PMH is turned on #9

Closed davidschober closed 7 years ago

davidschober commented 8 years ago

FINAL:

NOTES: We're not sure whether this needs to be turned on in Fedora itself; there is a library.

We need to investigate how to deal with the Fedora server. We may need to open the Tomcat/Fedora server to the world, and we don't yet know how to do that properly.

Project github repo and README: https://github.com/fcrepo4-labs/fcrepo4-oaiprovider

Installation wiki: https://wiki.duraspace.org/display/FEDORA40/Setup+OAI-PMH+Provider

davidschober commented 8 years ago

@d-venckus Can we get an update on this?

davidschober commented 8 years ago

@johndorr is this still valid? There was a bit of confusion.

johndorr commented 8 years ago

Yes it is.


davidschober commented 8 years ago

OK. David, can you keep digging in?


davidschober commented 8 years ago

Opened a bug report, @d-venckus. Waiting to hear back.

davidschober commented 7 years ago

@d-venckus do you want @Toputnal to take a look at this?

davidschober commented 7 years ago

@d-venckus and @Toputnal another thought is to use a Solr OAI provider. I'm not sure how much work it would take to develop the profiles, but it's a thought. https://github.com/IISH/oai4solr

It may be more efficient as well.

csyversen commented 7 years ago

@davidschober it looks like we can easily restrict the OAI provider to specific Solr cores, so that's good: we don't have to open up all of our Solr cores to OAI if we don't want to (though maybe that would be a good idea).

It does look like it'll require an XSLT file to translate our Solr XML schema into oai_dc format (along with some other configuration), so it's not like we can just flip a switch and be done.

Do we know how easy it is to enable oai-pmh in fedora? @mbklein do you know?
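
As a rough illustration of the mapping such an XSLT would have to perform, here is a minimal Python sketch that turns one Solr document into an oai_dc record. The Solr field names in `FIELD_MAP` are hypothetical placeholders, not our actual schema, and this is not how oai4solr itself does it (that would be an XSLT); it only shows the shape of the field-to-element translation.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for the oai_dc metadata format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# ASSUMPTION: hypothetical Solr field names; the real index will differ.
FIELD_MAP = {
    "title_tesim": "title",
    "creator_tesim": "creator",
    "description_tesim": "description",
    "date_created_tesim": "date",
}


def solr_doc_to_oai_dc(doc):
    """Return an oai_dc XML string built from a Solr document dict."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for solr_field, dc_name in FIELD_MAP.items():
        values = doc.get(solr_field, [])
        # Solr fields may be single-valued or multi-valued.
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample document, for illustration only.
sample = {"title_tesim": ["Example report"], "creator_tesim": ["Doe, Jane"]}
record_xml = solr_doc_to_oai_dc(sample)
print(record_xml)
```

Fields missing from the document are simply skipped, which is also why records without usable DC-like metadata would come out nearly empty.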

davidschober commented 7 years ago

It's turned on for Images, but I think it needs a meaningful DC datastream to work right, so some work is still necessary. For example:

http://repository.library.northwestern.edu/fedora/oai?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:example.org:inu:dil-5f4c79ec-9f65-4e41-88aa-b4f1e128829b

venckus was working on turning it on for IR, but I was wondering if this would be a better/more generic solution.
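
For anyone who wants to poke at the endpoint, requests like the one above can be assembled with a small helper. This is only a sketch: the function name `oai_request_url` is made up for illustration, the base URL is the one from the example request, and nothing here actually contacts the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL taken from the example request in this thread.
BASE_URL = "http://repository.library.northwestern.edu/fedora/oai"

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(verb, base_url=BASE_URL, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(params)
    # urlencode percent-encodes reserved characters such as ':' in identifiers.
    return f"{base_url}?{urlencode(query)}"


identify_url = oai_request_url("Identify")
get_record_url = oai_request_url(
    "GetRecord",
    metadataPrefix="oai_dc",
    identifier="oai:example.org:inu:dil-5f4c79ec-9f65-4e41-88aa-b4f1e128829b",
)

# Round-trip check: the server side would decode the identifier back to this.
decoded = parse_qs(urlsplit(get_record_url).query)
print(identify_url)
print(decoded["identifier"][0])
```

The colons in the identifier come out percent-encoded (`%3A`), which is equivalent to the hand-written URL above once decoded.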


csyversen commented 7 years ago

@davidschober then the same work might need to be done in Fedora too. We'd probably still have to create an XSLT to transform our data (in this case, the Images VRA metadata) into oai_dc.

My impression is that it's six of one, half a dozen of the other between enabling OAI on Solr vs. Fedora.

Toputnal commented 7 years ago

I don't yet know enough to have an opinion one way or the other. Please give me a direction in which to hack @davidschober @csyversen @d-venckus @mbklein

davidschober commented 7 years ago

@Toputnal let's install the Fedora one first.

csyversen commented 7 years ago

This is on my list of things to talk to MBK about too, so hopefully we can get more guidance soon


davidschober commented 7 years ago

@mbklein can you take a look and chime in on direction?

Toputnal commented 7 years ago

Well, this might help with direction. Other than Arch, all our staging Fedora instances (libfedora{1,2,3}s) are running some version of Fedora 3. The instructions above are for Fedora 4. I'm looking for the equivalent instructions for Fedora 3, but haven't found much as of yet. As for Solr, we have libsolr1s and libsolr2s which are both running 4.9.0. Should I focus on getting IR OAI PMH working via Solr, instead? @csyversen @davidschober @d-venckus @mbklein Also, any ideas what might get splattered if we start adding stuff to our running Solr cores? Would we just need to let Solr re-index in a worst case scenario?

davidschober commented 7 years ago

Thanks for digging into this @Toputnal. This issue is specifically scoped to the Fedora4 instance running ARCH.

OAI is turned on for our Fedora 3 boxes, but many records don't have the DC metadata to make it relevant (see the OAI-PMH Identify response on repository.library). I'm pretty sure it comes out of the box in 3.8, as opposed to being an add-in.

OAI PMH is a little different in Fedora 3.8

Toputnal commented 7 years ago

I have installed the OAI provider on nufiarepo-s according to the instructions, but I don't think it works. If someone can give me a hand on this, I would appreciate it. I'm not sure where to begin troubleshooting other than trying to find clues in catalina.out, which I've been doing.

Toputnal commented 7 years ago

@csyversen @d-venckus ^^^

davidschober commented 7 years ago

Put a query on the slack channel.

Toputnal commented 7 years ago

I put a query in on the Hydra channel and got a reference to: https://groups.google.com/d/msg/fedora-tech/wxi8h9I55GM/fqal3dlNBAAJ which, basically, shows that the OAI-PMH module is a community-supported module with no maintainer. While others are also interested, only one school appears to have gotten things working, on a specific 4.2 release. That work is not portable to 4.x, and the version of 4.2 that school compiled includes customizations that may or may not work with what we have.

Toputnal commented 7 years ago

We are running Fedora 4.6 on sufiarepo-s, for the record.

davidschober commented 7 years ago

Awesome! Thanks Jim. Let’s pull this back to “ready” and look into using the SOLR adapter.


davidschober commented 7 years ago
csyversen commented 7 years ago

https://github.com/projecthydra-labs/share_notify !

i think this might be our way forward! If our goal is to get our documents and projects over to OSF, then we can take alternate routes instead of OAI-PMH.

csyversen commented 7 years ago

(also i'm going to unassign myself for right now since this is work that's going to have to go back into the backlog for a little bit (we shouldn't be working on new features right before a release))

davidschober commented 7 years ago

Agreed, @csyversen. @cpd3149 and @johndorr, we should talk more about this. Briefly, the OAI-PMH extensions are no longer being actively worked on and don't really work.

chrisdaaz commented 7 years ago

i brought this up at a breakout session at code4lib, and it was suggested we don't do this from fedora but from the hydra layer. frankly, i don't know what this means, but my question about oai-pmh with fedora 4 was met with suspicion.

chrisdaaz commented 7 years ago

i think one of the reasons why oai-pmh isn't working on the more recent versions of the stack (fedora, solr, blacklight) is because of the newer oai resourcesync spec: http://www.openarchives.org/rs/1.1/resourcesync

which appears to be auto-enabled in sufia: https://github.com/projecthydra/sufia/wiki/Feature-matrix

i'm not sure how to verify that this works and is a sufficient replacement for oai-pmh.

chrisdaaz commented 7 years ago

looks like resourcelist does this: https://arch.library.northwestern.edu/capabilitylist

and

https://arch.library.northwestern.edu/resourcelist

@nulib/repodev
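
One way to sanity-check a resource list is to parse it and see which resources it advertises. The sketch below assumes the document follows the ResourceSync sitemap-based format (a `urlset` with an `rs:md` capability element); the sample XML and its example.org URLs are made up for illustration and are not the actual output of the Arch endpoint.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ResourceSync documents (Sitemap plus ResourceSync terms).
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# ASSUMPTION: illustrative sample only, not real data from our server.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2017-03-13T00:00:00Z"/>
  <url>
    <loc>https://example.org/concern/generic_works/abc123</loc>
    <lastmod>2017-03-01T12:00:00Z</lastmod>
  </url>
  <url>
    <loc>https://example.org/concern/generic_works/def456</loc>
    <lastmod>2017-03-02T08:30:00Z</lastmod>
  </url>
</urlset>"""


def parse_resource_list(xml_text):
    """Return (capability, [(loc, lastmod), ...]) from a ResourceSync list."""
    root = ET.fromstring(xml_text)
    md = root.find("rs:md", NS)
    capability = md.get("capability") if md is not None else None
    resources = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        resources.append((loc, lastmod))
    return capability, resources


capability, resources = parse_resource_list(SAMPLE)
print(capability, len(resources))
for loc, lastmod in resources:
    print(loc, lastmod)
```

To try it against the live endpoint, the XML text could be fetched with `urllib.request.urlopen` and passed to `parse_resource_list`; whether the result is enough for a harvester like SHARE is a separate question.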

chrisdaaz commented 7 years ago

sorry, i want to look into this some more. it seems i jumped the gun on the resourcesync discovery.

davidschober commented 7 years ago

@cpd3149 let's write up an issue about resourcelist. I move to close this.

chrisdaaz commented 7 years ago

@davidschober i sent the resourcelist file to SHARE OSF last month (3/13), but we're still pending approval. i'll follow up with them on the status. if they confirm that they can harvest from resourcelist, we can close. if not, we may need to look into something like: https://github.com/code4lib/ruby-oai

chrisdaaz commented 7 years ago

i heard back from OSF about ResourceSync: they said that "ResourceSync looks like a sufficient alternative" but they haven't added support for that protocol yet ("will take a long time").

Between ResourceSync and the share_notify gem (#184), we should be covered.