Closed gerritholl closed 3 years ago
I don't know why it still fails on Windows despite replacing / by os.sep.
This still fails on Windows (I don't know why) and it doesn't work as intended with S3FileSystem (it seems globbing behaves differently there): fs_s3.glob("noaa-goes16/ABI-L1b-RadF/2020/045/*")
returns nothing even though fs_s3.ls("noaa-goes16/ABI-L1b-RadF/2020/045/")
does, and fs_s3.glob("noaa-goes16/ABI-L1b-RadF/2020/*")
is very slow, suggesting there is something recursive going on.
Finding files is extremely slow due to globbing in s3fs being extremely slow, probably due to the problem reported at https://github.com/dask/s3fs/issues/378.
Note that at present the s3fs example only works with s3fs 0.5.0 and not with the latest released s3fs 0.5.1, due to the bug reported at https://github.com/dask/s3fs/issues/378. Hopefully that problem will be fixed so that this searching works with s3fs 0.5.2 or 0.6.
Can anyone with access to a Windows Python installation shed light on why the tests may be failing on Windows? Did the tests actually succeed on Windows prior to this PR?
Hi @gerritholl! I can test it on a Windows machine, but not before some time later next week since I don't have a setup ready. However, the tests were working before this PR as you can see in the build log: https://github.com/atmtools/typhon/actions/runs/288892462
Meanwhile https://github.com/dask/s3fs/pull/379 was merged so the typhon functionality and the documented example should work with the latest s3fs master or the next release.
This seems to be caused by inconsistent handling of path separators. One place I found to be inconsistent is in
self.file_system.glob
returns a path with /
separators while regex
uses the \\
version. There might be other places where this causes a problem, but this is all I had time to look at now.
Ah, that makes sense. If we search on a remote file system like this PR intends to support, then the assumption that the file system uses os.sep
as a separator is no longer guaranteed to be true. I will try to think of a solution.
With all the commits to make it work on Windows it has become regrettably somewhat ugly, and perhaps breaking backward compatibility despite tests passing. Can Windows users please comment if it breaks their workflow or not?
I don't know if there are any Windows users of Typhon at the moment, so don't expect an answer to your question. We mainly added Windows as a supported platform because Typhon is a Python-only package and it seemed like the right thing to do to be good Python citizens. Thanks for putting in the work to keep this cross-platform. From my side this is ready to be merged. I'll have a chat with Lukas if he's also fine with it when he's back next week. I don't think you'll need to add John to PRs in the future since he has moved on to other things.
I thought I had write access here, but apparently not (anymore?) as no "merge" button shows up ("Only those with write access to this repository can merge pull requests.).
Merges can only be done by maintainers which are currently Lukas and me.
This PR adds partial support for fsspec AbstractFileSystem implementation in FileSet, in particular for finding files. An example using
sf3s
:Gives: