Closed milutz closed 4 years ago
The main reason is that the APIs were changing pretty fast and I didn't have time to keep updating the code with those changes.
I would be happy to merge a PR that unpins them and fixes whatever the new APIs have.
@danielfrg Coolness! (and many thanks for the quick reply!)
I'm fighting a problem in my own code that I think may be fixed by newer versions. Once I figure that out (if it's true), I'll circle back and submit a PR for the change. Thanks for the background!!
... so I found a wrinkle with dask/s3fs newer than 0.2.2: the `info` call in s3_fs.py passes `refresh=True` (e.g. https://github.com/danielfrg/s3contents/blob/master/s3contents/s3_fs.py#L123 ), which is listed as a valid arg in the dask/s3fs docs ( https://s3fs.readthedocs.io/en/latest/api.html#s3fs.core.S3FileSystem.info ), but isn't valid in the dask/s3fs 0.3.0+ code (e.g. https://github.com/dask/s3fs/blob/0.3.0/s3fs/core.py#L441 ).
Still digging to see what to do about it
This issue might be relevant : https://github.com/dask/s3fs/issues/207
Hi, I got the error message too when I tried pinning s3fs==0.2.2. Looking at the s3fs code base I see quite a lot of refactoring, and since I'm not a pro at this I hope someone can pick it up and make the changes needed so it stays compatible with dask.
@thbeh I just built from my test fork ( https://github.com/milutz/s3contents/tree/patch-1 ) with s3fs==0.2.2 and gcsfs>=0.3.0, and I got something that would let me start up Jupyter with c.NotebookApp.contents_manager_class = S3ContentsManager
I haven't done much testing yet, but at least I can now start up. I'll play around a bit and see if anything explodes; please post if you know of a problem.
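For anyone trying the same thing, the server-side wiring is a Jupyter config fragment along these lines (a sketch based on the config line above; the bucket name is a placeholder, and `get_config()` is provided by Jupyter when it loads the file):

```python
# jupyter_notebook_config.py -- sketch; bucket name is a placeholder
from s3contents import S3ContentsManager

c = get_config()
c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.bucket = "my-notebooks-bucket"  # placeholder
```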
I need to get something running now, so I'm trying this quick fix of downgrading, but I suspect the real fix will be along the lines of what the dask issue I pointed at describes ( https://github.com/dask/s3fs/issues/207 ): it suggests you can call invalidate_cache() in place of passing refresh=True. It's unclear to me whether the two are exactly equivalent, though.
@danielfrg do you have any insight from when you originally added refresh=True to the s3_fs.py calls?
@milutz using s3fs==0.2.2 did work for me too, but it errors when I try using it in a dask environment, where dask requires s3fs>=0.3.0 to work with s3.
@thbeh Ah, that makes sense. Yes, I'll run into that problem soon too. I'll likely have to look at the invalidate_cache() approach, unless @danielfrg knows of some reason that's not a good fix.
Hi, I'm stuck with this s3fs version too. Do you have an idea of when you plan to fix this? Thank you for your work!
A potential workaround I'm using is to create two separate virtual environments: one for Jupyter (with the pinned s3fs version) and one for the kernel (with newer versions).
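That two-environment workaround might look roughly like this (a sketch; the env paths and names are illustrative, and the pins follow the versions discussed in this thread):

```shell
# Server env: runs Jupyter + s3contents with the pinned s3fs
python3 -m venv ~/envs/jupyter-server
~/envs/jupyter-server/bin/pip install notebook s3contents "s3fs==0.2.2"

# Kernel env: free to use the newer s3fs that dask needs
python3 -m venv ~/envs/work-kernel
~/envs/work-kernel/bin/pip install ipykernel "s3fs>=0.3.0"

# Register the kernel env with Jupyter so it shows up as a kernel choice
~/envs/work-kernel/bin/python -m ipykernel install --user --name work-kernel
```

The server never imports the kernel's s3fs, so the conflicting pins never meet in one process.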
I made a PR for supporting newer versions of s3fs. However, it seems that s3fs>=0.3 requires Python 3, causing the Travis CI build to fail. FYI @danielfrg https://github.com/danielfrg/s3contents/pull/86
Fixed in newer versions.
Both s3fs and gcsfs are pinned in the requirements to versions more than a year old. Any particular reason why?