Closed o-smirnov closed 6 months ago
I think it is used as follows in dask-ms: https://github.com/ratt-ru/dask-ms/blob/a0043fba3eae3eabdbdd6e2fb1f22abf7d762dbb/daskms/fsspec_store.py#L17
Edit: Your use may actually be simpler as you probably don't need to know whether it is zarr, parquet or casa table backed.
Just as a note to self before I forget: the reason this matters (as opposed to just making the MS name input a plain string) is that the singularity backend needs to know which directories will be accessed, so that they can be bound inside the container. For MSs nested under the CWD, this doesn't matter since the CWD is always bound. The problem arises when the MS is somewhere else in the directory hierarchy.
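To make the binding concern concrete, here is a minimal sketch of the check involved. The helper name dir_to_bind is hypothetical and not part of any existing backend, and it works purely on path strings, so it assumes no symlinks are involved:

```python
import os.path
from typing import Optional


def dir_to_bind(ms_path: str, cwd: str) -> Optional[str]:
    """Return the directory that would need an extra bind for ms_path,
    or None if the MS already sits under cwd (assumed to be always bound).

    Hypothetical helper for illustration; symlinks are not resolved.
    """
    base = os.path.normpath(cwd)
    # Relative MS paths are interpreted relative to cwd; absolute ones override it.
    ms = os.path.normpath(os.path.join(base, ms_path))
    if os.path.commonpath([ms, base]) == base:
        return None  # nested under the CWD, nothing extra to bind
    return os.path.dirname(ms)  # parent directory of the MS
```

With cwd set to /home/user/work, data/wsrt.ms needs no extra bind, while /scratch/obs/wsrt.ms would need /scratch/obs bound.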
fsspec looks overly complicated for what I need, so I'd rather not add the extra dependency. All I need to know is: is a given string a dask-ms URL or a path to a local file?
A simple regex will do. I just need to know what the possibilities to match are. Hence, question for @JSKenyon @sjperkins: is it true that all dask-ms URLs look like foo::bar://baz or bar://baz?
This is probably a reasonable subset:
/path/to/wsrt.ms
file://path/to/wsrt.ms
s3://host.address/path/to/wsrt.zarr
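A regex along these lines would cover that subset. This is only a sketch: the function names are made up for illustration, the optional foo:: prefix is included only because it was asked about (not because dask-ms is confirmed to use it), and treating file:// as local is an assumption:

```python
import re
from typing import Optional

# Optional "foo::" prefix, then a "scheme://" part, then the rest.
_URL_RE = re.compile(
    r"^(?:(?P<prefix>[A-Za-z][A-Za-z0-9+.-]*)::)?"
    r"(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*)://"
    r"(?P<rest>.*)$"
)


def url_scheme(ms: str) -> Optional[str]:
    """Return the URL scheme of ms, or None if it looks like a plain path."""
    m = _URL_RE.match(ms)
    return m.group("scheme") if m else None


def is_local(ms: str) -> bool:
    """True for plain paths and for file:// URLs (assumed to be local)."""
    return url_scheme(ms) in (None, "file")
```

Only strings for which is_local() is true would need to be considered for directory binding; anything else is left for dask-ms to interpret.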
Thanks. Finally, what's a good name for this dtype? MSX? DaskMS? DMS?
I think the above are fairly generic URL schemes. I wouldn't say they're dask-ms specific. Would a url dtype work?
Thinking about this a bit more, perhaps uri would be better than url as it references both local and remote datasets.
Since dask-ms apps can use an S3 backend for their MSs, the current MS dtype is not quite adequate. Introduce a new type that is fsspec-aware. @JSKenyon @sjperkins got an example of how to query an fsspec?