ckan / ckanext-dcat

CKAN ♥ DCAT

Harvesting a dcat catalog raises error #188

Closed quaxsze closed 3 years ago

quaxsze commented 3 years ago

Harvesting a DCAT catalog under Python 3 raises a UnicodeDecodeError:

'utf-8' codec can't decode byte 0xc3 in position 1023: unexpected end of data

for chunk in r.iter_content(chunk_size=self.CHUNK_SIZE):
    if six.PY2:
        content = content + chunk
    else:
        content = content + chunk.decode('utf8')

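This happens because each chunk is decoded on its own: a multi-byte UTF-8 character can straddle a chunk boundary, so the decoder is handed a truncated byte sequence. In the error above, 0xc3 is the lead byte of a two-byte character, and position 1023 is consistent with it being the last byte of a 1024-byte chunk.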

The content should be decoded only once, after it has been fully downloaded.
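A minimal sketch of the idea, not the actual ckanext-dcat fix. The helper names (`chunks_of`, `decode_after_download`, `decode_incrementally`) are hypothetical, and `chunks_of` only stands in for `r.iter_content(...)` so the example runs without a network call. It shows two ways to avoid the error: collect the raw bytes and decode once, or use the standard library's incremental decoder, which buffers a partial character until the next chunk arrives.

```python
import codecs


def chunks_of(data, size):
    """Yield successive byte chunks (hypothetical stand-in for r.iter_content)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def decode_after_download(chunks):
    """Accumulate raw bytes and decode a single time at the end."""
    return b"".join(chunks).decode("utf-8")


def decode_incrementally(chunks):
    """Decode chunk by chunk, letting the decoder hold back partial characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True makes the decoder raise if the stream ends mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# "é" encodes to two bytes (0xc3 0xa9), so a chunk size of 3 splits a character.
payload = "é" * 4
raw = payload.encode("utf-8")

text_once = decode_after_download(chunks_of(raw, 3))
text_incremental = decode_incrementally(chunks_of(raw, 3))

# Decoding each chunk separately reproduces the reported failure.
try:
    "".join(chunk.decode("utf-8") for chunk in chunks_of(raw, 3))
    per_chunk_failed = False
except UnicodeDecodeError:
    per_chunk_failed = True
```

Decoding once is the simpler change and matches the suggestion above. The incremental decoder may be worth considering if the text is needed while the download is still in progress.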