Open hcayless opened 7 years ago
@m-k-r can you provide answers to the above questions in this ticket? We are blocked on papyri.info integration testing until we can get them answered. Thanks.
cc @rla2118 @rogerbagnall @jcowey @HolgerEssler @Edelweiss
Think I figured it out. The script doesn't seem to work, but I was able to extract the Java command from it and run that to generate the files. Are we sure this is what we want for DCLP browsing?
The drop-down menu is what we have used. It is not perfect, and it would be better if it could be made to work the same way as DDbDP. The categories series, TM number, and authors and works are definitely wanted. Does that help, and is it clear?
@rla2118 @rogerbagnall @HolgerEssler please advise as to whether you want the drop-down menu for browsing or would prefer something more like ddbdp. Please note that there may not be time/resources available to implement anything different from what is already in place on litpap.info, but I think we should try to capture the preference in any case.
I ran into this problem myself last week. This script assumes that Saxon is exported in the classpath. But if it is exported, WEBrick doesn't work. A possible solution would be to provide Saxon and call it by absolute path.
The Navigator works for DCLP the same way as for DDbDP, HGV, or APIS. The drop-down list is just an additional way to sort and categorize.
Saxon is already in SoSOL. Could this be used for the Navigator, or is it better not to have cross-dependencies?
The exist-update script is called by pn-sync. This way the DCLP part is separated from papyri.info and can be replaced if another solution is found, like moving the eXist database to Solr.
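A minimal sketch of the absolute-path idea, assuming a hypothetical jar location and a hypothetical helper named `saxon_command`. The command is built with an explicit `-cp`, so nothing needs to be exported in the global classpath. Only `net.sf.saxon.Transform` and its `-s:`, `-xsl:`, and `-o:` options come from Saxon's command-line interface; the paths and names are illustrative.

```python
import subprocess
from pathlib import Path

# Hypothetical location; adjust to wherever the Saxon jar actually lives.
SAXON_JAR = Path("/opt/saxon/saxon9he.jar")


def saxon_command(source, stylesheet, output, jar=SAXON_JAR):
    """Build a java command that runs Saxon from an explicit jar path."""
    jar = Path(jar)
    if not jar.is_absolute():
        # Refuse relative paths so the call never depends on the working
        # directory or on an exported CLASSPATH.
        raise ValueError(f"Saxon jar path must be absolute: {jar}")
    return [
        "java",
        "-cp", str(jar),
        "net.sf.saxon.Transform",
        f"-s:{source}",
        f"-xsl:{stylesheet}",
        f"-o:{output}",
    ]


def run_transform(source, stylesheet, output, jar=SAXON_JAR):
    """Run the transform; raises CalledProcessError if Saxon exits non-zero."""
    subprocess.run(saxon_command(source, stylesheet, output, jar), check=True)
```

Because the classpath is scoped to this one `java` invocation, it should not interfere with other processes such as WEBrick.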
If it is not documented, the application used in existdb can be found here.
Should I rewrite this Guide with our additions?
Cross-dependencies aren't really the issue. SoSOL and the PN run in different containers, and pretty much have to because they have quite different resource usage profiles. The Editor is much more resource-intensive (because of JRuby its heap is basically a giant hairball of HashMaps), but the Navigator gets much more use and is much leaner, and we wouldn't want to have the Editor bring the whole thing down if (when) it starts running out of memory.
If this ends up being the way we do things, I'd move the XSLT transform part into pn-indexer, which already does a lot of XSLT (and has Saxon available), and call it directly within pn-sync without having to resort to the shell script. I don't like having the script overloaded like this. It's obscure.
It sounds like the plan is to leave the eXist setup at Heidelberg, so the script (minus the DCLP browse-building part) is probably fine. Maybe it could even just be triggered by a cron job.
So what are the next steps here?
Per @rla2118 and @jcowey, the preference is to have DDB/HGV/APIS-style browse rather than the drop-down if at all possible, with the caveat that browse by authors+works, TM, and editions is essential.
Pinging @m-k-r and @hcayless RE my question above about what the tasks on this are now and also making reference to my most recent comment. Note priority upgrade.
Thanks @m-k-r . So I'm confused: what's all this about a drop-down and how it's different from the regular-style browse? And what are the other matters involved in this ticket (if any)?
@hcayless @m-k-r @jcowey can any of you help clarify?
The dropdown menu was originally a dirty fix when the regular browsing wasn't working yet. We kept the dropdown menu because the search picks up DCLP and DDBDP together and the dropdown menu only lists DCLP.
I think one next step would be to move the pn-scripts/generateCorpusOverview.xsl call out of the exist-update.sh script and into pn-indexer/src/info/papyri/indexer.clj (this may also require moving the XSL file itself into pn-xslt/).
Edit: Also, the top-level DCLP browse at http://litpap.info/browse/dclp/ seems to be fine without exist-update.sh/generateCorpusOverview.xsl being called, because it's generated by pn-dispatcher. However, DCLP TM, "by series", and authors+works are generated by that XSLT. Is "by series" (http://litpap.info/browse/dclp/series) also crucial? It seems to be the same as the hierarchical top-level view at http://litpap.info/browse/dclp/, just flattened. Maybe this is different because the end result of using the hierarchical view is a Solr search by series which doesn't differentiate between DCLP and DDbDP?
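To illustrate the distinction raised here, a series search can be narrowed to one collection with a Solr filter query (`fq`). This is a minimal sketch, assuming hypothetical field names `series` and `collection` and a hypothetical core URL; the actual Navigator schema and endpoint may well differ.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint and field names, for illustration only.
SOLR_SELECT = "http://localhost:8983/solr/pn-search/select"


def series_query_url(series, collection=None, base=SOLR_SELECT):
    """Build a select URL for a series, optionally limited to one collection."""
    # Main query: match the series name as a phrase.
    # (Assumes the series name itself contains no double quotes.)
    params = [("q", f'series:"{series}"')]
    if collection is not None:
        # Filter query: restrict results without changing the main query.
        params.append(("fq", f"collection:{collection}"))
    return f"{base}?{urlencode(params)}"
```

Calling `series_query_url("p.oxy")` would search across all collections, while `series_query_url("p.oxy", "dclp")` adds the filter that keeps only DCLP records.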
@jcowey, @rla2118, and @rogerbagnall can you please comment on the observations and questions implied in @ryanfb's comment, immediately preceding? Thanks.
I don't have any comment on the technical questions at stake, but it's not obvious to me why the series view is essential. It may have value for someone else that I'm not aware of.
@jcowey and I discussed this on Skype. I'll be providing more clarity shortly.
I think there are two related issues being discussed here:
I think this warrants two separate tickets. Accordingly, I am now proclaiming the current ticket (#288) to be about "how and where" (i.e., the technical aspects). I have created a new ticket (#303) for the "what do we want" discussion. I am marking the technical ticket (this one) as "blocked" until we get #303 sorted.
Questions on #303 are now sorted, so this issue is no longer blocked. Over to @hcayless to decide the disposition of this ticket in light of what he's working on.
I'm trying to set up DCLP on the ISAW Atlantides server, with mixed success. The pages are getting created, and at least partially indexed, but browsing doesn't work at all. The Apache config points at pages that were removed a few weeks ago. See https://github.com/DCLP/navigator/commit/bfa0219173ab655c7d4c34b8643e14fe7926aabc, but https://github.com/DCLP/navigator/commit/3f8d55a1f048c066098044285bc93de718edf903. It looks like maybe https://github.com/DCLP/navigator/blob/master/pn-scripts/exist-update.sh is meant to generate those files and also do something with eXist? How and when is this supposed to be run?