sul-dlss-deprecated / rialto-webapp

The web front end of the RIALTO project

Browse details links not working #354

Closed justinlittman closed 5 years ago

justinlittman commented 5 years ago

Steps to reproduce:

  1. Go to Browse section.
  2. Click the title of any result.

Expected result: Go to details page of that item.

Actual result: 404

justinlittman commented 5 years ago

I can't reproduce this locally. Best guess is that it is introduced by shib.

justinlittman commented 5 years ago

After @jonrober removed shib, this started working correctly.

peetucket commented 5 years ago

Yeah, I see that too, works now.

jonrober commented 5 years ago

Doing some testing on prod, it's not shib itself (I think), but something with the proxy in general. If I try to go to:

https://rialto.sul.stanford.edu/#/item/http:%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F9db8942eec34affdf15572bb614147c9

The Apache logs show that the server is actually being told to access:

GET /catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F9db8942eec34affdf15572bb614147c9

Is there anything in the javascript that you can think of that might be affecting this?
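
For reference, the two URLs above are related roughly as follows. The part after `#` is a client-side route that the browser never sends to the server; the request Apache logs is the one the client then makes for the same item, with the item's identifier percent-encoded into a single path segment. The sketch below is in Python purely for illustration (the real client is JavaScript), and the idea that the path is built by encoding the whole identifier is an inference from the logged request, not taken from the app's source.

```python
from urllib.parse import quote, urlsplit

# The address typed into the browser (copied from the comment above).
browser_url = (
    "https://rialto.sul.stanford.edu/#/item/"
    "http:%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F"
    "9db8942eec34affdf15572bb614147c9"
)

parts = urlsplit(browser_url)
# Only the path ("/") goes to the server; the fragment stays in the browser.
print("path sent to server:", parts.path)
print("client-side route:  ", parts.fragment)

# The item identifier is itself a URL. Encoding it with no "safe" characters
# turns ":" into %3A and every "/" into %2F, so it fits in one path segment.
item_id = "http://sul.stanford.edu/rialto/agents/people/9db8942eec34affdf15572bb614147c9"
xhr_path = "/catalog/" + quote(item_id, safe="")
print("follow-up request:  ", xhr_path)
```

The resulting path is the same `/catalog/http%3A%2F%2F...` request that shows up in the Apache log, so seeing `/catalog/...` there instead of `/#/item/...` is expected.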

jonrober commented 5 years ago

Never mind. I compared to stage and saw the same behavior, so that's not it. But the comparison also showed me that the proxy doesn't seem to be passing along the query string, which would be the problem. Looking into why now.

jonrober commented 5 years ago

I've played with the same shib proxy in front of a basic site and can't replicate it not passing things along, which implies it's something more complicated with the rialto webapp in particular. What all should be passed back and forth during this interaction?

Here's what clicking on a link on the browse page looks like on prod (behind shib):

########################
[9b654b86-7cd2-4e8b-8d02-64e1ade489b1] Solr query: get select {"qt"=>nil, "facet.field"=>["type_ssi", "{!ex=pub_year_ssim_single}pub_year_ssim", "subject_label_ssim", "school_label_ssim", "department_label_ssim", "institute_label_ssim", "institution_label_ssim", "agent_ssim", "concept_labels_ssim", "countries_label_ssim", "created_year_isim", "person_subtype_ssi"], "facet.query"=>[], "facet.pivot"=>[], "fq"=>[], "hl.fl"=>[], "rows"=>25, "qf"=>"title_tesi name_tsim author_label_tsim abstract_tesim identifiers_ssim", "q"=>"", "facet"=>true, "f.pub_year_ssim.facet.limit"=>1001, "f.subject_label_ssim.facet.limit"=>1001, "f.school_label_ssim.facet.limit"=>1001, "f.department_label_ssim.facet.limit"=>1001, "f.institute_label_ssim.facet.limit"=>1001, "f.institution_label_ssim.facet.limit"=>1001, "f.concept_labels_ssim.facet.limit"=>1001, "f.countries_label_ssim.facet.limit"=>1001, "sort"=>"score desc, pub_date_si desc, title_si asc"}
18:45:27 [9b654b86-7cd2-4e8b-8d02-64e1ade489b1] Solr fetch (201.9ms)
18:45:27 [9b654b86-7cd2-4e8b-8d02-64e1ade489b1] method=GET path=/catalog format=json controller=CatalogController action=index status=200 duration=948.02 view=742.19
########################

And here's what it looks like on stage, not behind shib:

########################
[5d2c528d-e6a3-4049-a8cc-1ef95fde9d2e] Solr query: get select {"qt"=>nil, "facet.field"=>["type_ssi", "{!ex=pub_year_ssim_single}pub_year_ssim", "subject_label_ssim", "school_label_ssim", "department_label_ssim", "institute_label_ssim", "institution_label_ssim", "agent_ssim", "concept_labels_ssim", "countries_label_ssim", "created_year_isim", "person_subtype_ssi"], "facet.query"=>[], "facet.pivot"=>[], "fq"=>[], "hl.fl"=>[], "rows"=>25, "qf"=>"title_tesi name_tsim author_label_tsim abstract_tesim identifiers_ssim", "facet"=>true, "f.pub_year_ssim.facet.limit"=>1001, "f.subject_label_ssim.facet.limit"=>1001, "f.school_label_ssim.facet.limit"=>1001, "f.department_label_ssim.facet.limit"=>1001, "f.institute_label_ssim.facet.limit"=>1001, "f.institution_label_ssim.facet.limit"=>1001, "f.concept_labels_ssim.facet.limit"=>1001, "f.countries_label_ssim.facet.limit"=>1001, "sort"=>"score desc, pub_date_si desc, title_si asc"}
18:47:21 [5d2c528d-e6a3-4049-a8cc-1ef95fde9d2e] Solr fetch (234.4ms)
18:47:21 [5d2c528d-e6a3-4049-a8cc-1ef95fde9d2e] method=GET path=/ format=html controller=CatalogController action=index status=200 duration=1041.29 view=802.81
18:47:22 [6bd5ac33-728c-489c-a020-a5508c6c3330] Solr query: get select {"qt"=>nil, "facet.field"=>"type_ssi", "facet.query"=>[], "facet.pivot"=>[], "fq"=>[], "hl.fl"=>[], "rows"=>0, "qf"=>"title_tesi name_tsim author_label_tsim abstract_tesim identifiers_ssim", "facet"=>true, "f.pub_year_ssim.facet.limit"=>1001, "f.subject_label_ssim.facet.limit"=>1001, "f.school_label_ssim.facet.limit"=>1001, "f.department_label_ssim.facet.limit"=>1001, "f.institute_label_ssim.facet.limit"=>1001, "f.institution_label_ssim.facet.limit"=>1001, "f.concept_labels_ssim.facet.limit"=>1001, "f.countries_label_ssim.facet.limit"=>1001, "sort"=>"score desc, pub_date_si desc, title_si asc", "f.type_ssi.facet.limit"=>21, "f.type_ssi.facet.offset"=>0}
18:47:22 [6bd5ac33-728c-489c-a020-a5508c6c3330] Solr fetch (49.9ms)
18:47:22 [6bd5ac33-728c-489c-a020-a5508c6c3330] method=GET path=/catalog/facet/type_ssi.json format=json controller=CatalogController action=facet status=200 duration=53.10 view=0.61
18:47:24 [205a6226-e6aa-47ff-ac71-90b4fcfc1915] Solr query: get select {"qt"=>nil, "facet.field"=>["type_ssi", "{!ex=pub_year_ssim_single}pub_year_ssim", "subject_label_ssim", "school_label_ssim", "department_label_ssim", "institute_label_ssim", "institution_label_ssim", "agent_ssim", "concept_labels_ssim", "countries_label_ssim", "created_year_isim", "person_subtype_ssi"], "facet.query"=>[], "facet.pivot"=>[], "fq"=>[], "hl.fl"=>[], "rows"=>25, "qf"=>"title_tesi name_tsim author_label_tsim abstract_tesim identifiers_ssim", "q"=>"", "facet"=>true, "f.pub_year_ssim.facet.limit"=>1001, "f.subject_label_ssim.facet.limit"=>1001, "f.school_label_ssim.facet.limit"=>1001, "f.department_label_ssim.facet.limit"=>1001, "f.institute_label_ssim.facet.limit"=>1001, "f.institution_label_ssim.facet.limit"=>1001, "f.concept_labels_ssim.facet.limit"=>1001, "f.countries_label_ssim.facet.limit"=>1001, "sort"=>"score desc, pub_date_si desc, title_si asc"}
18:47:24 [205a6226-e6aa-47ff-ac71-90b4fcfc1915] Solr fetch (210.2ms)
18:47:24 [205a6226-e6aa-47ff-ac71-90b4fcfc1915] method=GET path=/catalog format=json controller=CatalogController action=index status=200 duration=850.26 view=636.15
18:47:26 [03eb2513-cf2f-486f-99d6-c7899e7ae73b] Solr query: get get {:qt=>nil, :ids=>"http://sul.stanford.edu/rialto/publications/92fa5aeff343e8fd53b543dfff6c15c3"}
18:47:26 [03eb2513-cf2f-486f-99d6-c7899e7ae73b] Solr fetch (3.3ms)
18:47:26 [03eb2513-cf2f-486f-99d6-c7899e7ae73b] method=GET path=/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fpublications%2F92fa5aeff343e8fd53b543dfff6c15c3 format=json controller=CatalogController action=show status=200 duration=9.57 view=3.70
########################

So the webapp isn't getting as many calls when behind shib, but I don't know enough about the webapp to know how those calls are normally being made, so I'm not sure why they'd be blocked. Could I get more information about how the application works there?

justinlittman commented 5 years ago

18:47:26
[03eb2513-cf2f-486f-99d6-c7899e7ae73b] Solr query: get get {:qt=>nil, :ids=>"http://sul.stanford.edu/rialto/publications/92fa5aeff343e8fd53b543dfff6c15c3"}

18:47:26
[03eb2513-cf2f-486f-99d6-c7899e7ae73b] Solr fetch (3.3ms)

18:47:26
[03eb2513-cf2f-486f-99d6-c7899e7ae73b] method=GET path=/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fpublications%2F92fa5aeff343e8fd53b543dfff6c15c3 format=json controller=CatalogController action=show status=200 duration=9.57 view=3.70

is the relevant section.

The app client is making an XHR call to get info on the publication (/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fpublications%2F92fa5aeff343e8fd53b543dfff6c15c3). This results in the webapp making a call to Solr.

I don't see that XHR call in the logs when shib is in place.

jonrober commented 5 years ago

Thanks. Does the XHR call pass along cookies?

Here's the flow as Apache sees it:

Main login:
10.0.1.57 - - [26/Feb/2019:16:33:41 +0000] "POST /Shibboleth.sso/SAML2/POST HTTP/1.1" 302 216 "https://login.stanford.edu/idp/profile/SAML2/Redirect/SSO?execution=e1s2" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:41 +0000] "GET / HTTP/1.1" 200 1304 "https://login.stanford.edu/idp/profile/SAML2/Redirect/SSO?execution=e1s2" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:42 +0000] "GET /packs/css/rialto-948f80da.css HTTP/1.1" 200 23061 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:42 +0000] "GET /packs/js/rialto-0c93c93244d662174c2f.js HTTP/1.1" 200 973320 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:44 +0000] "GET /home_intlcollab.png HTTP/1.1" 200 140301 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:44 +0000] "GET /StanfordLibraries-logo-cmyk.png HTTP/1.1" 200 6086 "https://rialto.sul.stanford.edu/packs/css/rialto-948f80da.css" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:44 +0000] "GET /home_trends.png HTTP/1.1" 200 110164 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:44 +0000] "GET /home_topicarea.png HTTP/1.1" 200 138987 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
10.0.1.57 - jonrober [26/Feb/2019:16:33:44 +0000] "GET /catalog/facet/type_ssi.json HTTP/1.1" 200 252 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

Going to Browse page: 10.0.1.57 - jonrober [26/Feb/2019:16:35:49 +0000] "GET /catalog?q= HTTP/1.1" 200 396832 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

Clicking on a link: 10.0.1.57 - - [26/Feb/2019:16:36:04 +0000] "GET /catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F43227 HTTP/1.1" 404 256 "https://rialto.sul.stanford.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

There's no username attached to that last request, which would normally show even for a 404, so that implies the XHR call isn't sending the Shib cookie along with the request. On the other hand, Shibboleth isn't firing there to try to create a new session either. The other possibility is that the proxy itself, unrelated to shib auth, is failing. However, it's a pretty simple proxy. The best test for that would be to put a proxy without Shibboleth in front, though that could disrupt things you're working on.
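
The observation about the missing username comes from the third field of each access-log line, which holds the authenticated user or `-` when there is none. As a rough sketch of how that comparison could be automated, the Python below pulls the user, request line and status out of two of the lines quoted above (with the referer and user-agent trimmed off). The regular expression assumes Apache's common/combined log layout and is illustrative only.

```python
import re

# Leading fields of an Apache common/combined log line:
# host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(line):
    """Return user (None if '-'), request line and status, or None if unparsable."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    user = match.group("user")
    return {
        "user": None if user == "-" else user,
        "request": match.group("request"),
        "status": int(match.group("status")),
    }


browse = '10.0.1.57 - jonrober [26/Feb/2019:16:35:49 +0000] "GET /catalog?q= HTTP/1.1" 200 396832'
detail = (
    '10.0.1.57 - - [26/Feb/2019:16:36:04 +0000] '
    '"GET /catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F43227 HTTP/1.1" 404 256'
)

for line in (browse, detail):
    print(summarize(line))
```

The browse request parses with user `jonrober` and status 200, while the details request has no user and status 404, which is the pattern described above.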

justinlittman commented 5 years ago

At this point, it is OK to disrupt production (cc @aaron-collier).

Here's what a request on prod that results in a 404 looks like as curl:

curl 'https://rialto.sul.stanford.edu/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F43227' -H 'pragma: no-cache' -H 'cookie: _ga=GA1.2.914914787.1536061082; mf_dd31473b-d667-4129-b968-22d78b1c69e3=427daf6e1e24fcf733ab016ec0458de1|09191386730a122ca19ab25303c644d68edd92e2.5574678609.1537366094771,09192537fd849c7e72a6407603783982fae0f20b.-2040678692.1537366107727,09244798c0b6464c3b111dd3ed1899aeb4e6708f.8187105520.1537784988732,10123359a1accd984041b0d00b87d11c096329e1.922342169.1539341613661,10231936b2a7b352ed316445b1b2e41f327a0b3c.-2580569789.1540300639439|1540301617974||1|||0|16.01; _lo_uid=73682-1540995307545-9724a96b43307973; __lotl=https%3A%2F%2Fwww.gsb.stanford.edu%2Fexec-ed%2Fprograms%2Fredwood-city%2Fspeaker-series; __lotr=https%3A%2F%2Fwww.google.com%2Furl%3Fq%3Dhttps%253A%252F%252Fwww.gsb.stanford.edu%252Fexec-ed%252Fprograms%252Fredwood-city%252Fspeaker-series%26sa%3DD%26usd%3D2%26usg%3DAFQjCNFgy1Jy8W4swXlMwnuBLDscOq3egg; mf_55d6233e-5245-4a7c-854d-f6e64e753af4=-1; __qca=P0-614389616-1544050457773; optimizelyEndUserId=oeu1544635620724r0.7235776061029218; optimizelySegments=%7B%22558400441%22%3A%22false%22%2C%22559604449%22%3A%22direct%22%2C%22559760501%22%3A%22gc%22%7D; optimizelyBuckets=%7B%7D; __gads=ID=2f983aa50f7e38c6:T=1544635748:S=ALNI_MZFJplafXZ_CoJAwrWDfiJmrwa9Cg; __zlcmid=qbhyhM6gUft8VZ; _fbp=fb.1.1549586085512.1793252899; _gcl_au=1.1.1572393579.1550195446; _A_time=0.1550195461675.1550195461675; lo_session_in=1; _lo_v=5; __unam=f302de2-1660b20bee4-50fe258a-16; _shibsession_64656661756c7468747470733a2f2f7269616c746f2e73756c2e7374616e666f72642e6564752f73686962626f6c657468=_6d8275ad4f108d3f03fcd5f0810edec2; _rialto_webapp_session=Ru2jJNwZeR4R5cxtyYUT9sXXlwnCBCWAMN4RpYF79lg6whgtfuVIF5K7r7QhQjIDODnTY2JHEDN0w1%2FAWZt4jIMYYzPq48NAKPoWfTQHkymSWtHWR20%3D--Rfa6rlttQSGgSAh7--9wFEvMg20Rx4iesAH7iWtw%3D%3D' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: en-US,en;q=0.9' -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36' -H 'accept: application/json, text/plain, */*' -H 'cache-control: no-cache' -H 'authority: rialto.sul.stanford.edu' -H 'x-requested-with: XMLHttpRequest' -H 'referer: https://rialto.sul.stanford.edu/' --compressed

Do you see the shib cookie?

Also, if this were a shib cookie problem, I would expect all XHR requests to fail, not just this one.

jonrober commented 5 years ago

I definitely wouldn't see the shib cookie being sent on a curl request unless you're explicitly sending it. I just don't know enough about XHR to know if those requests normally send cookies.

I'll test with a proxy change and see what that looks like.

justinlittman commented 5 years ago

Sorry -- I didn't fully explain. That curl command is created by Chrome to reproduce the XHR request. So if the XHR request sent a cookie, it would be in that curl command.
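
Since the curl command mirrors the XHR request, the question of whether the Shibboleth session cookie was sent can be answered by looking at the names in its `cookie:` header; the command above does include one whose name starts with `_shibsession_`. Below is a minimal Python sketch of that check. The header value is a shortened stand-in with placeholder values rather than the real cookie string, and the helper names are made up for this example.

```python
def cookie_names(cookie_header):
    """Split a Cookie header value into cookie names (values are ignored)."""
    names = []
    for pair in cookie_header.split(";"):
        name, sep, _value = pair.strip().partition("=")
        if sep:  # skip fragments that have no "name=value" shape
            names.append(name)
    return names


def has_shib_session(cookie_header):
    """True if any cookie name starts with the Shibboleth session prefix."""
    return any(name.startswith("_shibsession_") for name in cookie_names(cookie_header))


# Placeholder header: same cookie names as in the thread, dummy values.
header = "_ga=GA1.2.0; _shibsession_abc123=_deadbeef; _rialto_webapp_session=xyz"

print(cookie_names(header))
print(has_shib_session(header))
print(has_shib_session("_ga=GA1.2.0; _rialto_webapp_session=xyz"))
```

The check only shows that the browser attached the cookie; it says nothing about whether the server accepted the session.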

jonrober commented 5 years ago

Do you know if your code is using: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/withCredentials ?

justinlittman commented 5 years ago

Not sure. We're using libraries that are built on top of XMLHttpRequest.

jonrober commented 5 years ago

Could you check those libraries, specifically with whatever the call would be to generate that request? The way I read that page, if they're not making the request using withCredentials, that's going to block most auth from working. It's not a shib-specific thing and not something I can test short of writing something on the proxy to intercept that specific URL path and print what is received.

jcoyne commented 5 years ago

Setting withCredentials has no effect on same-site requests.

jcoyne commented 5 years ago

Every XHR request is exactly the same as a normal request. All the expected cookies are sent along.

jonrober commented 5 years ago

Thanks for the feedback there. Will be doing the proxy checking post-meeting.

jcoyne commented 5 years ago

@jonrober can you confirm that the proxy is not modifying the URL (which contains a URL-escaped URL)? https://rialto.sul.stanford.edu/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F43227

jonrober commented 5 years ago

Okay, I found the issue. Since a lot of attempts to try and access paths for badly configured webservers involve sending encoded slashes to try and poke at the local filesystem, Apache has a setting that automatically gives a 404 error to things using %2F in the URL by default, before it hits proxies or any other processing. I've disabled that setting, and prod looks good now to me.
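
The comment doesn't name the setting. The behavior described matches Apache's `AllowEncodedSlashes` directive, whose default value `Off` makes the server answer 404 to any URL whose path contains `%2F`, while `NoDecode` lets such URLs through with the slashes left encoded. Assuming that is the directive involved, the Python below is a deliberately simplified model of the check, not Apache's actual code; the function name and return convention are made up for illustration.

```python
def precheck(raw_path, allow_encoded_slashes="Off"):
    """Simplified model of the encoded-slash check applied before proxying.

    Returns 404 if the request is rejected outright, or None if it would be
    handed on to the proxy / application unchanged.
    """
    has_encoded_slash = "%2f" in raw_path.lower()
    if has_encoded_slash and allow_encoded_slashes == "Off":
        return 404
    return None


detail_path = "/catalog/http%3A%2F%2Fsul.stanford.edu%2Frialto%2Fagents%2Fpeople%2F43227"

print(precheck(detail_path))              # rejected under the default setting
print(precheck(detail_path, "NoDecode"))  # passed through once allowed
print(precheck("/catalog?q="))            # no encoded slash, never affected
```

This also accounts for why only the details requests failed: the other XHR calls, such as `/catalog?q=` and `/catalog/facet/type_ssi.json`, contain no `%2F` in the path.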

jonrober commented 5 years ago

@jcoyne: It shouldn't be; it's explicitly set to pass things along without modifying them, as part of an earlier fix.

justinlittman commented 5 years ago

Great catch @jonrober. Yet another thing to file away in my mental list of Apache gotchas.