Open bethanycayeradcliff opened 4 years ago
@kywark just so you know--I added side 1 of the Beecher audio and it broke the player :(
This is definitely an issue with the CORS headers on the audio file itself.
When we examine the headers with the following:
curl --dump-header headers.txt 'https://server15878.contentdm.oclc.org/dmwebservices/index.php?q=dmGetStreamingFile/p15878coll1/34.mp3/byte/json'
We see:
HTTP/1.1 200 OK
Date: Wed, 04 Nov 2020 15:33:03 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Accept-Ranges: 0-16082590
Content-Range: bytes 0-16082589/16082590
Content-Length: 16082590
Content-Type: audio/mpeg
We believe that, to work with a remote streaming audio file, the CORS headers need to be:
Access-Control-Allow-Origin:
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept, Authorization
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Request-Method:
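A rough sketch of how a saved header dump could be checked against the list above. The function name `missing_cors_headers` is made up for this example and is not part of curl or CONTENTdm. It only looks for the three `Access-Control-Allow-*` response headers; `Access-Control-Request-Method` is left out because it is a header browsers send in preflight requests rather than one a server returns.

```shell
#!/bin/sh
# Hypothetical helper: print each CORS response header that is absent
# from a header dump. $1 is a file written by `curl --dump-header`.
missing_cors_headers() {
  for name in Access-Control-Allow-Origin \
              Access-Control-Allow-Headers \
              Access-Control-Allow-Methods; do
    # Header names are case-insensitive, so match with grep -i.
    grep -qi "^${name}:" "$1" || printf '%s\n' "$name"
  done
}

# Usage (needs network access, so it is left commented out):
#   curl -s -o /dev/null --dump-header headers.txt \
#     'https://server15878.contentdm.oclc.org/dmwebservices/index.php?q=dmGetStreamingFile/p15878coll1/34.mp3/byte/json'
#   missing_cors_headers headers.txt
```

An empty result would mean all three headers are present. Whether all three are actually required for the player is still an open question in this thread.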
Here's an example of a working, streaming mpeg: https://saracarl.github.io/test/kut-recording/
And its headers are:
HTTP/1.1 200 OK
Content-Disposition: attachment
X-Ais-Podcast-Cache: Disk
Content-Type: audio/mpeg
Cache-Control: max-age=0
Accept-Ranges: bytes
Last-Modified: Thu, 05 Nov 2020 21:37:22 GMT
Content-Length: 432319548
Connection: keep-alive
Instance-id: 779971e84c8464a59a87930a8fa2ee05
Server: AIS Streaming Server 8.3.2
Set-Cookie: AISSessionId=0CD_382_309__b251c80012cce1dbfce5ecb3881f005e3935f088; Path=/; Domain=.streamguys1.com; Max-Age=6000; Expires=Thu, 05 Nov 2020 23:25:13 GMT
Here's a working example from the Internet Archive: https://saracarl.github.io/test/ia-streaming-audio-test/
https://ia800504.us.archive.org/2/items/gd1995-07-09.sbd.miller.114369.flac16/gd95-07-09d2t06.mp3
Headers:
HTTP/1.1 200 OK
Server: nginx/1.16.1 (Ubuntu)
Date: Thu, 05 Nov 2020 21:53:50 GMT
Content-Type: audio/mpeg
Content-Length: 13411482
Last-Modified: Mon, 04 Jul 2011 00:39:38 GMT
Connection: keep-alive
ETag: "4e110bca-cca49a"
Strict-Transport-Security: max-age=15724800
Expires: Fri, 06 Nov 2020 03:53:50 GMT
Cache-Control: max-age=21600
Access-Control-Allow-Origin: *
Accept-Ranges: bytes
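To compare the header sets quoted in this thread side by side, something like the following could pull a single header's value out of each saved dump. This is only a sketch: `header_value` and the dump file names in the usage comment are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one header from a dump file.
# $1 = file written by `curl --dump-header`, $2 = header name.
header_value() {
  awk -v name="$2" '
    BEGIN { name = tolower(name) }
    {
      sub(/\r$/, "")                 # drop the CR that ends HTTP header lines
      i = index($0, ":")             # split on the first colon only
      if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
        v = substr($0, i + 1)
        sub(/^[ \t]+/, "", v)        # trim leading whitespace from the value
        print v
      }
    }' "$1"
}

# Usage, assuming one dump per server has already been saved
# (these file names are placeholders):
#   for f in contentdm.txt streamguys.txt archive.txt; do
#     printf '%s: %s\n' "$f" "$(header_value "$f" Accept-Ranges)"
#   done
```

Running it for `Access-Control-Allow-Origin` and `Accept-Ranges` would show at a glance where the CONTENTdm response differs from the two working examples.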
I've asked OCLC support to help us out on this by fixing CONTENTdm's CORS headers.
More info from HRC:
Disabling the download link doesn't prevent downloading via the API, so I think there is hope. I'm not familiar with Professor Clement's tool so couldn't say whether this information will be especially helpful, but the API calls that fetch the stream or download the file for the first recording in the record you referenced (https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/37/rec/1) are:
https://hrc.contentdm.oclc.org/utils/getstream/collection/p15878coll1/id/33
https://hrc.contentdm.oclc.org/utils/getfile/collection/p15878coll1/id/33/filename/34.mp3
How do you work out what values to plug into the API? If you click on the second file associated with the record and then back onto the first, the URL changes from that of the record's landing page to its actual URL, https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/33. That's all you need to get the stream; just change "digital" to "utils/getstream". The filename at the end of the second API call is probably just the pointer (id) plus one. I think that if you guess the wrong filename it still downloads a file, but instead of the MP3 you'll get an XML document containing the filenames associated with the record, misnamed as an MP3 file.
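The URL rewriting described above could be scripted roughly like this. The function names are invented for this sketch, and it relies only on the URL patterns quoted in this thread rather than on any documented CONTENTdm API. The "id plus one" filename is the guess described above, not a confirmed rule, so there is also a small check for the misnamed-XML case.

```shell
#!/bin/sh
# Hypothetical helpers based only on the URL patterns quoted in this thread.
# Each expects an item URL that ends in the numeric pointer id, e.g.
# https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/33

# Item URL -> stream URL: swap "digital" for "utils/getstream".
to_getstream_url() {
  printf '%s\n' "$1" | sed 's#/digital/#/utils/getstream/#'
}

# Item URL -> download URL. The filename is a GUESS (pointer id plus one).
to_getfile_url() {
  id=${1##*/}            # trailing path segment, e.g. 33
  guess=$((id + 1))      # guessed filename number, e.g. 34
  base=$(printf '%s\n' "$1" | sed 's#/digital/#/utils/getfile/#')
  printf '%s/filename/%s.mp3\n' "$base" "$guess"
}

# A wrong guess reportedly returns XML named like an MP3.
# Assumption: that XML starts with "<"; MP3 data starts with "ID3" or 0xFF.
looks_like_xml() {
  [ "$(head -c 1 "$1")" = "<" ]
}
```

After downloading with the guessed URL, running `looks_like_xml` on the saved file would flag a wrong filename before the file gets handed to the player.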
Also, I created a project here with a few "tests" of the HRC links: http://audiannotate.brumfieldlabs.com/project/bethanycayeradcliff/testing-HRC-Beecher-Links
I also made items in the project using the links with GetFile (https://hrc.contentdm.oclc.org/utils/getfile/collection/p15878coll1/id/33/filename/33.mp3) and GetStream (https://hrc.contentdm.oclc.org/utils/getstream/collection/p15878coll1/id/33) that Chris Jahnke from the Ransom Center provided.
This item page is from the link with "GetFile": https://bethanycayeradcliff.github.io/testing-HRC-Beecher-Links/r_0124_01_01-link-with-getfile/
This is the item page from the link with "GetStream": https://bethanycayeradcliff.github.io/testing-HRC-Beecher-Links/r_0124_01_01-link-with-getstream/
Both links also create a broken player.
I also added a new item in the same project to test with a UT Box link: https://bethanycayeradcliff.github.io/testing-HRC-Beecher-Links/r_0124_01_01-box-link/#?c=&m=&s=&cv=
It also did not work, though I know we have had some streaming issues with Box.
Describe the bug
This is an issue that has happened before, but the Beecher audio recording from this page (direct link to audio here) is displaying with a broken player (see screenshot below).
To Reproduce
Steps to reproduce the behavior:
I wonder if because of the /byte/json the formatting is breaking?
Expected behavior
The player should display correctly.
Screenshots
Additional context
I think this is what had happened before when I tried this audio, and the reason I deleted the original repo before our workshop :)