archesproject / arches

Arches is a web platform for creating, managing, & visualizing geospatial data. Arches was inspired by the needs of the Cultural Heritage community, particularly the widespread need of organizations to build & manage cultural heritage inventories.
GNU Affero General Public License v3.0

Django default_storage breaks on non-ascii characters in files #11337

Closed whatisgalen closed 1 day ago

whatisgalen commented 4 weeks ago

When importing a CSV, Django's default_storage is often used to handle the file, like so:

with default_storage.open(csv_file_path, mode="r") as csvfile:
    reader = csv.reader(csvfile)
    data = {"csv": [line for line in reader], "csv_file": csv_file_name}

However the default encoding used is ascii, which raises an error on non-ASCII characters. Django's default_storage.open method does not take an encoding= kwarg like Python's built-in open does, so there is no good way to force UTF-8 encoding when opening files this way.
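
One possible workaround, sketched here under assumptions rather than taken from the Arches codebase: open the file in binary mode and decode the bytes explicitly, so the platform's default text encoding never comes into play. The helper name `read_csv_rows` and its `storage` parameter are hypothetical; any object with an `open(name, mode=...)` method that returns a binary file object should work, and Django's `default_storage` is expected to fit that shape.

```python
import csv
import io


def read_csv_rows(storage, csv_file_path, encoding="utf-8"):
    """Return every row of a CSV file, decoding its bytes explicitly.

    `storage` is a hypothetical parameter: any object whose
    open(name, mode="rb") returns a binary file object.
    """
    # Binary mode avoids relying on the platform's default text encoding.
    with storage.open(csv_file_path, mode="rb") as raw_file:
        text = raw_file.read().decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))
```

With Django, the call would presumably look like `read_csv_rows(default_storage, csv_file_path)`. This reads the whole file into memory, which matches the original snippet's behavior of collecting every row into a list.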

jacobtylerwalls commented 3 weeks ago

@whatisgalen you might consider posting on the Django forum to gauge appetite for doing the same thing for the storage interface that was done for the file interface in 5.0 to allow all kwargs through to open() (assuming this is still a problem in 5.1).

whatisgalen commented 3 weeks ago

> @whatisgalen you might consider posting on the Django forum to gauge appetite for doing the same thing for the storage interface that was done for the file interface in 5.0 to allow all kwargs through to open() (assuming this is still a problem in 5.1).

Just posted!

jacobtylerwalls commented 1 day ago

> However the default encoding used is ascii

Just noting that this is platform dependent. Were you testing with Windows?

whatisgalen commented 21 hours ago

> Just noting that this is platform dependent. Were you testing with Windows?

Ubuntu 20 actually
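
For anyone trying to reproduce this on their own machine, a small diagnostic sketch may help. When no encoding is passed, Python's built-in open() normally falls back to the locale's preferred encoding, so printing that value shows what a text-mode read would use. The function name `describe_default_encoding` is made up for illustration, and `sys.flags.utf8_mode` assumes Python 3.7 or newer.

```python
import locale
import sys


def describe_default_encoding():
    """Collect the settings that decide Python's default text encoding."""
    return {
        # Used by open() when no encoding= is given (outside UTF-8 mode).
        "preferred_encoding": locale.getpreferredencoding(False),
        # True when Python runs in UTF-8 mode (-X utf8 or PYTHONUTF8=1).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding used for file names, reported for completeness.
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for key, value in describe_default_encoding().items():
        print(f"{key}: {value}")
```

If `preferred_encoding` reports an ASCII variant, the process is likely running under a C/POSIX locale, which can happen on Linux servers and in containers where no UTF-8 locale is configured.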