hipstas / AudiAnnotate

Workflows for generating AV editions and exhibits using IIIF manifests by HiPSTAS and Brumfield Labs.
https://hipstas.github.io/AudiAnnotate/
Apache License 2.0

Player display issue with HRC Beecher audio #120

Open bethanycayeradcliff opened 4 years ago

bethanycayeradcliff commented 4 years ago

Describe the bug

This is an issue that has happened before, but the Beecher audio recording from this page (direct link to audio here) is displaying with a broken player (see screenshot below).

To Reproduce

Steps to reproduce the behavior:

  1. Go to the pages site for the project here: https://bethanycayeradcliff.github.io/racism-in-the-US/-r_0124_01_01-criminal-syndicalism-case-mccomb-mississippi-john-beecher-collection/#?c=&m=&s=&cv=
  2. You can see the broken player.

I wonder whether the /byte/json at the end of the audio URL is what breaks the formatting?

Expected behavior

The player should display correctly.

Screenshots

Screen Shot 2020-10-06 at 2 04 42 PM

Additional context

I think this is what had happened before when I tried this audio, and the reason I deleted the original repo before our workshop :)

bethanycayeradcliff commented 4 years ago

@kywark just so you know--I added side 1 of the Beecher audio and it broke the player :(

saracarl commented 4 years ago

Response headers:

curl --dump-header headers.txt "https://server15878.contentdm.oclc.org/dmwebservices/index.php?q=dmGetStreamingFile/p15878coll1/34.mp3/byte/json"

HTTP/1.1 200 OK
Date: Wed, 04 Nov 2020 15:33:03 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Accept-Ranges: 0-16082590
Content-Range: bytes 0-16082589/16082590
Content-Length: 16082590
Content-Type: audio/mpeg

saracarl commented 4 years ago

This is definitely an issue with the CORS headers on the audio file itself.

When we examine the headers with the following:

curl --dump-header headers.txt https://server15878.contentdm.oclc.org/dmwebservices/index.php?q=dmGetStreamingFile/p15878coll1/34.mp3/byte/json

We see:

HTTP/1.1 200 OK
Date: Wed, 04 Nov 2020 15:33:03 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Accept-Ranges: 0-16082590
Content-Range: bytes 0-16082589/16082590
Content-Length: 16082590
Content-Type: audio/mpeg

We believe that, to work with a remote streaming audio file, the CORS headers need to be:

Access-Control-Allow-Origin:
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept, Authorization
Access-Control-Allow-Methods: GET, OPTIONS
Access-Control-Request-Method:

Here's an example of a working, streaming mpeg: https://saracarl.github.io/test/kut-recording/

And its headers are:

HTTP/1.1 200 OK
Content-Disposition: attachment
X-Ais-Podcast-Cache: Disk
Content-Type: audio/mpeg
Cache-Control: max-age=0
Accept-Ranges: bytes
Last-Modified: Thu, 05 Nov 2020 21:37:22 GMT
Content-Length: 432319548
Connection: keep-alive
Instance-id: 779971e84c8464a59a87930a8fa2ee05
Server: AIS Streaming Server 8.3.2
Set-Cookie: AISSessionId=0CD_382_309__b251c80012cce1dbfce5ecb3881f005e3935f088; Path=/; Domain=.streamguys1.com; Max-Age=6000; Expires=Thu, 05 Nov 2020 23:25:13 GMT

Here's a working example from the Internet Archive: https://saracarl.github.io/test/ia-streaming-audio-test/

https://ia800504.us.archive.org/2/items/gd1995-07-09.sbd.miller.114369.flac16/gd95-07-09d2t06.mp3

Headers:

HTTP/1.1 200 OK
Server: nginx/1.16.1 (Ubuntu)
Date: Thu, 05 Nov 2020 21:53:50 GMT
Content-Type: audio/mpeg
Content-Length: 13411482
Last-Modified: Mon, 04 Jul 2011 00:39:38 GMT
Connection: keep-alive
ETag: "4e110bca-cca49a"
Strict-Transport-Security: max-age=15724800
Expires: Fri, 06 Nov 2020 03:53:50 GMT
Cache-Control: max-age=21600
Access-Control-Allow-Origin: *
Accept-Ranges: bytes
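
To compare header dumps like the ones above without reading them by eye, here is a minimal Python sketch. The function names (parse_headers, streaming_report) are made up for this illustration and are not part of AudiAnnotate or CONTENTdm. It only reports what a dump contains: whether Access-Control-Allow-Origin is present, and whether Accept-Ranges carries the usual value "bytes" (HTTP defines "bytes" and "none" as the values for that header). It does not prove which header breaks the player.

```python
# Shortened copies of two header dumps quoted in this thread.
CONTENTDM = """HTTP/1.1 200 OK
Server: Apache
Accept-Ranges: 0-16082590
Content-Range: bytes 0-16082589/16082590
Content-Type: audio/mpeg"""

INTERNET_ARCHIVE = """HTTP/1.1 200 OK
Server: nginx/1.16.1 (Ubuntu)
Content-Type: audio/mpeg
Access-Control-Allow-Origin: *
Accept-Ranges: bytes"""


def parse_headers(raw):
    """Turn a raw header dump into a dict keyed by lowercased header name."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        # The status line ("HTTP/1.1 200 OK") has no colon and is skipped.
        if sep and " " not in name.strip():
            headers[name.strip().lower()] = value.strip()
    return headers


def streaming_report(raw):
    """Report the headers this thread is comparing (hypothetical helper)."""
    h = parse_headers(raw)
    return {
        "allow_origin": h.get("access-control-allow-origin"),
        "accept_ranges": h.get("accept-ranges"),
        "byte_ranges_advertised": h.get("accept-ranges", "").lower() == "bytes",
    }


if __name__ == "__main__":
    for label, dump in (("CONTENTdm", CONTENTDM), ("Internet Archive", INTERNET_ARCHIVE)):
        print(label, streaming_report(dump))
```

The same functions can be pointed at the contents of headers.txt written by the curl command above.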

benwbrum commented 4 years ago

I've asked OCLC support to help us out on this by fixing CONTENTdm's CORS headers.

benwbrum commented 3 years ago

More info from HRC:

Disabling the download link doesn't prevent downloading via the API, so I think there is hope. I'm not familiar with Professor Clement's tool so couldn't say whether this information will be especially helpful, but the API calls that fetch the stream or download the file for the first recording in the record you referenced (https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/37/rec/1) are:

https://hrc.contentdm.oclc.org/utils/getstream/collection/p15878coll1/id/33

https://hrc.contentdm.oclc.org/utils/getfile/collection/p15878coll1/id/33/filename/34.mp3

How do you work out what values to plug into the API? If you click on the second file associated with the record and then back onto the first, the URL changes from that of the record's landing page to its actual URL: https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/33

That's all you need to get the stream: just change "digital" to "utils/getstream". The filename at the end of the second API call is probably just the pointer (id) plus one, but I think if you guess the wrong filename it still downloads a file; instead of the MP3 file, you'll get an XML document containing the filenames associated with the record, misnamed as an MP3 file.
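
The URL rewriting described above can be sketched in Python as follows. This is an illustration only: the helper names (getstream_url, getfile_url, looks_like_xml) are hypothetical, the "/digital/" prefix is taken from the example URLs in this comment, and the filename has to be supplied by the caller because the "pointer plus one" rule is only a guess. The looks_like_xml check is a rough heuristic for spotting the XML document that comes back when the filename is wrong.

```python
from urllib.parse import urlsplit, urlunsplit


def _swap_prefix(record_url, replacement):
    """Replace the leading /digital/ path segment of a record URL."""
    parts = urlsplit(record_url)
    prefix = "/digital/"
    if not parts.path.startswith(prefix):
        raise ValueError("expected a URL whose path starts with /digital/")
    new_path = "/" + replacement + "/" + parts.path[len(prefix):].rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, new_path, "", ""))


def getstream_url(record_url):
    """Build the streaming URL: 'digital' becomes 'utils/getstream'."""
    return _swap_prefix(record_url, "utils/getstream")


def getfile_url(record_url, filename):
    """Build the download URL; the filename is an assumption of the caller."""
    return _swap_prefix(record_url, "utils/getfile") + "/filename/" + filename


def looks_like_xml(data):
    """Heuristic: a wrong filename reportedly yields XML, which starts with '<'."""
    return data.lstrip().startswith(b"<")


if __name__ == "__main__":
    record = "https://hrc.contentdm.oclc.org/digital/collection/p15878coll1/id/33"
    print(getstream_url(record))
    print(getfile_url(record, "34.mp3"))
```

Passing the first few bytes of a downloaded file to looks_like_xml gives a quick check before handing the link to the player.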

bethanycayeradcliff commented 3 years ago

Also I created a project here with a few "tests" of the HRC links http://audiannotate.brumfieldlabs.com/project/bethanycayeradcliff/testing-HRC-Beecher-Links

I also made items in the project using the GetFile link (https://hrc.contentdm.oclc.org/utils/getfile/collection/p15878coll1/id/33/filename/33.mp3) and the GetStream link (https://hrc.contentdm.oclc.org/utils/getstream/collection/p15878coll1/id/33) that Chris Jahnke from the Ransom Center provided.

Both links also create a broken player.

bethanycayeradcliff commented 3 years ago

I also added a new item in the same project to test with a UT Box link: https://bethanycayeradcliff.github.io/testing-HRC-Beecher-Links/r_0124_01_01-box-link/#?c=&m=&s=&cv= It also did not work, though I know we have had some streaming issues with Box.