freelawproject / doctor

A microservice for document conversion at scale
https://free.law/projects/doctor
BSD 2-Clause "Simplified" License

Unable to convert audio file due to encoding issue #153

Open sentry-io[bot] opened 1 year ago

sentry-io[bot] commented 1 year ago

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 162: ordinal not in range(256)

Sentry Issue: DOCTOR-N

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 162: ordinal not in range(256)
(3 additional frame(s) were not displayed)
...
  File "doctor/tasks.py", line 479, in set_mp3_meta_data
    audio_file.tag.audio_source_url = audio_data["download_url"]
  File "eyed3/id3/tag.py", line 797, in audio_source_url
    self._setUrlFrame(frames.URL_AUDIOSRC_FID, url)
  File "eyed3/id3/tag.py", line 759, in _setUrlFrame
    self.frame_set[fid] = frames.UrlFrame(fid, url)
  File "eyed3/id3/frames.py", line 421, in __init__
    self.url = url
  File "eyed3/id3/frames.py", line 432, in url
    url.encode(ISO_8859_1)  # Likewise, it must encode
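
The traceback shows the failure coming from `url.encode(ISO_8859_1)` inside eyed3's URL frame setter: the `download_url` contains a right single quotation mark (`'\u2019'`), which has no Latin-1 representation. One possible workaround, sketched here as an assumption rather than as the project's actual fix, is to percent-encode non-ASCII characters before assigning the URL to the tag. The helper name `to_latin1_safe_url` and the example URL are hypothetical; only `urllib.parse.quote` from the standard library is used.

```python
from urllib.parse import quote

# Characters left untouched: URL reserved characters, "~", and "%" so that
# escapes already present in the URL are not encoded a second time.
_SAFE_CHARS = ":/?#[]@!$&'()*+,;=%~"


def to_latin1_safe_url(url: str) -> str:
    """Hypothetical helper: percent-encode anything outside the safe set.

    Non-ASCII characters are encoded as UTF-8 bytes and then escaped,
    so the result is pure ASCII and always encodable as ISO-8859-1.
    """
    return quote(url, safe=_SAFE_CHARS)


# Illustrative URL only; the real download_url is not shown in the report.
raw_url = "https://example.com/audio/plaintiff\u2019s-argument.mp3"

try:
    raw_url.encode("latin-1")
except UnicodeEncodeError as exc:
    # Same class of error as in the Sentry report.
    print(f"raw URL fails: {exc.reason}")

safe_url = to_latin1_safe_url(raw_url)
safe_url.encode("latin-1")  # no longer raises
print(safe_url)
```

Under this assumption, the call in `set_mp3_meta_data` would pass the escaped value instead of `audio_data["download_url"]` directly. Whether the escaped form is acceptable for the stored audio source URL is a decision for the maintainers.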