SocialFinanceDigitalLabs / sf-fons-platform

https://github.com/SocialFinanceDigitalLabs/sf-fons

Bug: Spaces in S3 path break pipeline #89

Open MichaelHanksSF opened 3 months ago

MichaelHanksSF commented 3 months ago

In particular, with the external pipeline, having spaces in the path name breaks the code. Some initial experiments with escaping and with urllib/path don't resolve the issue.

For instance,

fs = open_fs("s3://my-bucket/")
file = open_file(fs, "ONS Area/file.csv")

Fails, but strangely this succeeds:

fs = open_fs("s3://my-bucket/ONS Area")
file = open_file(fs, "file.csv")
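
The working case suggests a possible stopgap: move the directory part of the path onto the filesystem URL and pass only the bare filename to `open_file`. Below is a minimal sketch of that split. `split_for_open` is a hypothetical helper, not part of the platform, and it rests only on the single observation above; whether every space-containing path behaves the same way is untested.

```python
import posixpath
from typing import Tuple


def split_for_open(base_url: str, relative_path: str) -> Tuple[str, str]:
    """Hypothetical helper: shift the directory part onto the filesystem URL.

    Returns (fs_url, filename) so the caller can open the filesystem at
    the directory and then open the file by its bare name.
    """
    directory, filename = posixpath.split(relative_path)
    if not directory:
        # Nothing to move: the path is already a bare filename.
        return base_url, filename
    fs_url = base_url.rstrip("/") + "/" + directory.strip("/")
    return fs_url, filename


fs_url, filename = split_for_open("s3://my-bucket/", "ONS Area/file.csv")
print(fs_url)    # s3://my-bucket/ONS Area
print(filename)  # file.csv
```

The two returned values would then be passed to `open_fs` and `open_file` respectively, mirroring the call pattern that succeeds.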

The easiest solution is to remove spaces from paths.
The longer-term solution is to find a way to handle these sorts of paths.
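
For the short-term fix, something like the following could flag or rename offending paths before they reach the pipeline. This is a rough sketch using only the standard library; the function names and the underscore replacement are assumptions for illustration, not an agreed convention.

```python
import re
from typing import Iterable, List


def paths_with_whitespace(paths: Iterable[str]) -> List[str]:
    """Return the paths that contain any whitespace character."""
    return [p for p in paths if re.search(r"\s", p)]


def replace_whitespace(path: str, replacement: str = "_") -> str:
    """Collapse each run of whitespace into a single replacement string."""
    return re.sub(r"\s+", replacement, path)


candidates = ["ONS Area/file.csv", "ons_area/file.csv"]
for bad in paths_with_whitespace(candidates):
    print(f"{bad!r} -> {replace_whitespace(bad)!r}")
```

Renaming changes the object key, so anything that refers to the old path would need updating too.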

MichaelHanksSF commented 3 months ago

@patrick-troy to verify no spaces in filenames and paths in:

@MichaelHanksSF to verify no spaces in filenames and paths in:

MichaelHanksSF commented 2 months ago

@patrick-troy can you also ensure that the documentation includes a section on the path format that needs to be followed?