Open fderue opened 6 years ago
See https://github.com/Ouranosinc/pyPavics/tree/issue_19
When I try to test, I get the following message:
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <html^><head><title>Apache Tomcat/7.0.63 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 404 - /twitcher/ows/proxy/thredds/dodsC/birdhouse/wps_outputs/flyingpigeon/fd63dce0-1c35-11e8-84ca-0242ac12000d/4b66a300-1c36-11e8-84ca-0242ac12000d.nc.dds</h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u>/twitcher/ows/proxy/thredds/dodsC/birdhouse/wps_outputs/flyingpigeon/fd63dce0-1c35-11e8-84ca-0242ac12000d/4b66a300-1c36-11e8-84ca-0242ac12000d.nc.dds</u></p><p><b>description</b> <u>The requested resource is not available.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.63</h3></body></html>
We'll need some help on this one.
threddsclient.crawl as it is right now should work properly (I tested it myself). The modifications in the issue_19 branch look fine (from a static reading of the code), as long as the dict held by netc.cookie has a key starting with auth (this should be validated in a debug session). The change will let the crawler support a THREDDS server behind a proxy that issues 302 redirects (the redirect has since been replaced in the current PAVICS config, so no hurry here).
When I tried to test it, I got errors whose origin I don't understand. So I'm holding off until someone can test the patch and confirm it works.
The URL returned in the message doesn't have a hostname; could this be the source of the error?
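For the debug-session check mentioned above, something like the following could be used. It is only a sketch: it assumes netc.cookie is a plain dict mapping cookie names to values, and the helper name `auth_cookies` and the sample values are hypothetical, not taken from the branch.

```python
def auth_cookies(cookie):
    """Return only the entries of `cookie` whose name starts with 'auth'."""
    return {name: value for name, value in cookie.items()
            if name.startswith("auth")}


# Hypothetical cookie contents, for illustration only.
cookie = {"auth_tkt": "abc123", "lang": "en"}

selected = auth_cookies(cookie)
if not selected:
    # The patch relies on such a key existing, so fail loudly if it does not.
    raise ValueError("no cookie key starting with 'auth'")
```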
https://github.com/Ouranosinc/pyPavics/blob/47d016dd0df80fd1923b0d3d59067ee308c59889/pavics/catalog.py#L278
When there is a redirect (302), the cookie has to be passed along explicitly when using the "requests" library (which is what threddsclient.crawl uses). Fix:
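A minimal sketch of the idea, not the actual patch: follow the redirects by hand and send the cookie again on every hop. The function name `get_with_cookies` is hypothetical, and `session` is assumed to be a `requests.Session` (or any object whose `get` accepts `cookies` and `allow_redirects`). A relative `Location` header, i.e. one without a hostname, is resolved against the current URL.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def get_with_cookies(session, url, cookies, max_redirects=5):
    """GET `url`, re-sending `cookies` on each redirect hop."""
    for _ in range(max_redirects + 1):
        # Disable automatic redirects so each hop is handled here.
        response = session.get(url, cookies=cookies, allow_redirects=False)
        if response.status_code not in REDIRECT_CODES:
            return response
        # Location may be relative (no hostname); resolve it against
        # the URL that was just requested.
        url = urljoin(url, response.headers["Location"])
    raise RuntimeError("too many redirects")
```

With requests installed this would be called as `get_with_cookies(requests.Session(), url, {"auth_tkt": "..."})`. Note that it sends the cookie to whatever host the redirect points at, so it should only be used when the redirect target is trusted.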