Open agrinh opened 1 year ago
To answer part of your question, yes you can use protocol-specific arguments to configure the HTTP backend:
>>> of = fsspec.open("http://google.com", http={"encoded": True})
>>> of.fs.encoded
True
(This is exactly equivalent to fsspec.open("http://google.com", encoded=True).)
The second part, not passing options that might have been intended for other backends, is undefined behaviour. I can see how it can be convenient, but
The intended use was originally only for multi-component URLs like "simplecache::http://server/path", where we know the two protocols involved, and can find the args to send to each; any "extra" kwargs always go to the foremost component, in this case simplecache.
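The routing rule described above can be modelled in a few lines of plain Python. This is only a sketch of the rule as stated in this comment, not fsspec's actual implementation: route_kwargs is a hypothetical name, the option names are purely illustrative, and paths without a protocol are not handled.

```python
def route_kwargs(url, **kwargs):
    """Split kwargs among the protocols of a chained URL (illustrative model only)."""
    # "simplecache::http://server/path" -> ["simplecache", "http"]
    protocols = [part.split("://", 1)[0] for part in url.split("::")]
    remaining = dict(kwargs)
    # Each protocol takes the dict passed under its own name, if any.
    routed = {proto: dict(remaining.pop(proto, {})) for proto in protocols}
    # Whatever is left over goes to the foremost component.
    routed[protocols[0]].update(remaining)
    return routed


routed = route_kwargs(
    "simplecache::http://server/path",
    http={"encoded": True},
    simplecache={"cache_storage": "/tmp/cache"},
    same_names=True,
)
print(routed)
```

Under this model, calling route_kwargs("http://google.com", http={"encoded": True}, az={"anon": False}) leaves the az dict among the "extra" kwargs, so it ends up with the http component. That is the situation the original question is about.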
@martindurant Thanks for the quick reply!
I understand, neither is a great option. Perhaps the best option is doing the opposite? I.e. allowing / requiring per protocol defaults in a specific argument that only passes down kwargs to the relevant implementation? Something like:
fsspec.open(..., protocol_defaults={"az": {"anon": False}, "http": {"encoded": True}})
I realize this is a bit less convenient, but the current behaviour is fairly confusing, with protocol-specific arguments making it down to the individual implementations.
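A rough sketch of how such a protocol_defaults argument might select kwargs, written as a standalone helper so it runs without fsspec. Both kwargs_for_url and protocol_defaults are hypothetical names taken from the proposal above; neither exists in fsspec today.

```python
def kwargs_for_url(url, protocol_defaults):
    """Return only the defaults registered for this URL's protocol (hypothetical helper)."""
    # Treat a path without "://" as a local file.
    protocol = url.split("://", 1)[0] if "://" in url else "file"
    # Exact match only: "https" would need its own entry in this sketch.
    return dict(protocol_defaults.get(protocol, {}))


protocol_defaults = {"az": {"anon": False}, "http": {"encoded": True}}

az_kwargs = kwargs_for_url("az://container/blob.csv", protocol_defaults)
http_kwargs = kwargs_for_url("http://server/path", protocol_defaults)
local_kwargs = kwargs_for_url("/tmp/data.csv", protocol_defaults)

# The selected kwargs could then be passed on, e.g. fsspec.open(url, **az_kwargs).
print(az_kwargs, http_kwargs, local_kwargs)
```

With this shape, options meant for one protocol never reach another implementation, at the cost of the longer call signature.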
That could be a possible solution, but we could not disallow az= directly now, as it is already in use; at least, not without a proper deprecation. I'm not convinced that the longer form would be very popular.
Setting protocol-specific options has been a convenient method for overriding the default options for each protocol. E.g., the Azure blob storage implementation behaves peculiarly and requires setting anon=False to use the credentials in the environment (https://github.com/fsspec/adlfs/issues/348). So for paths provided by an application, we might do:
This option is ignored for local paths and used for az:// protocol URLs, and therefore allows us to configure defaults for each protocol. Unfortunately, this doesn't work with http(s) protocol URLs, since the kwargs are forwarded directly to aiohttp, e.g. https://github.com/fsspec/filesystem_spec/blob/561428ca18a9865d8f63fe188a590d791ec52c92/fsspec/implementations/http.py#L826
fsspec.open method the convenience of this API greatly diminishes.
Either way, I'm happy to contribute code if we can agree on a solution.