aai-institute / lakefs-spec

An fsspec implementation for the lakeFS project.
http://lakefs-spec.org/
Apache License 2.0
37 stars 4 forks source link

Improve test coverage on lakefs abstract base classes #201

Closed AdrianoKF closed 6 months ago

AdrianoKF commented 7 months ago

This PR adds a variety of smoke tests that aim to cover the default functionality provided by the abstract base classes from fsspec, namely AbstractFileSystem and AbstractBufferedFile.

In order to avoid the problem discovered by one of the tests and reported upstream ( https://github.com/treeverse/lakeFS/issues/7127), the version of lakeFS dependencies has been bumped to v1.3.1 in the CI and dev environments.

[!NOTE] These tests are not meant to provide complete coverage of the base implementations, since these are tested as part of fsspec already. Rather, they aim to uncover any potential issues that arise from the mix of overwritten and default implementations and their interactions (e.g., quite a few implementations in the ABC rely on the return value from ls, which is overwritten in the library).

nicholasjng commented 6 months ago

Looks like _strip_protocol is coming back to bite us. Should we test with the following this time?


@classmethod
def _strip_protocol(cls, path):
    pass

This will work as long as paths are passed directly to API calling methods, which parse the path. If the path is used outside otherwise, we might need to roll back the overload deletion.

maxmynter commented 6 months ago

Do you mean just adding the _strip_protocol with pass as a method to our LakeFSFileSystem. Or are other changes involded?

Because, if we just add the above, we get TypeErrors in parse, at least for find and resultingly methods that rely on find such as glob.

Are there other implications to reinstating the overloads?

maxmynter commented 6 months ago

As a user, I would expect for find to work also on a branch without providing a ref in the branch. Currently it does not work, as the trailing / gets stripped of and parse throws an error.

But I believe we @AdrianoKF built a small fix for this when we paired earlier today. As also the test_walk_repo_root fails for the same reason, I believe that is the issue.

nicholasjng commented 6 months ago

That fix (adding back the trailing slash) was the point of our previous overload. So I think we might want to reapply that in this issue as well.

EDIT: Commit in question is c0daa3f04dd6739ea7dfdd3dc905f6f84d7c2ab6, so git revert c0daa3f should do the trick.

maxmynter commented 6 months ago

I have spent some time with the touch fsspec command and did not get it to work. The problem is that the objects_api.upload_object in our commit function does not accept an empty content / mime type. If we set one, we get an error that the local file at the path is not found.

So I believe touch does not work with the lakefs-sdk API. I could also find nothing on these notes in the lakeFS docs.

Should we overwrite the method in our filesystem with a not implemented error? Or is there another way to make it work?

nicholasjng commented 6 months ago

I have spent some time with the touch fsspec command and did not get it to work. The problem is that the objects_api.upload_object in our commit function does not accept an empty content / mime type. If we set one, we get an error that the local file at the path is not found.

So I believe touch does not work with the lakefs-sdk API. I could also find nothing on these notes in the lakeFS docs.

Should we overwrite the method in our filesystem with a not implemented error? Or is there another way to make it work?

I raised this with the lakeFS devs in https://github.com/treeverse/lakeFS/issues/7127, which they resolved right away.

This means that touch will not work for lakeFS servers with version v1.3.0 or older, but anything afterwards might.

codecov[bot] commented 6 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (23254ca) 88.23% compared to head (b4e5293) 89.80%. Report is 14 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #201 +/- ## ========================================== + Coverage 88.23% 89.80% +1.57% ========================================== Files 7 7 Lines 510 520 +10 Branches 79 82 +3 ========================================== + Hits 450 467 +17 + Misses 31 27 -4 + Partials 29 26 -3 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.