kaaveland / pyarrowfs-adlgen2

Use pyarrow with Azure Data Lake gen2
MIT License

Not compatible with pyarrow>=5.0.0 #11

Closed sisaberi closed 3 years ago

sisaberi commented 3 years ago

It seems pyarrowfs-adlgen2 is not compatible with the latest versions of pyarrow.

I'm on 0.2.1 and I get this error when trying to write files:

    File "pyarrow/_fs.pyx", line 671, in pyarrow._fs.FileSystem.open_output_stream
    File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
    File "pyarrow/_fs.pyx", line 1171, in pyarrow._fs._cb_open_output_stream
    TypeError: open_output_stream() takes 2 positional arguments but 3 were given

If this incompatibility is already known, it might be useful to restrict the supported pyarrow versions in setup.py.

Kind regards

kaaveland commented 3 years ago

That's a bug, I'll take a look at fixing it later this week. I guess arrow changed the fs-api under our feet. :-/

kaaveland commented 3 years ago

Right, docs aren't updated yet, but they did add mandatory new arguments to open_output_stream that pyarrowfs-adlgen2 isn't accepting. I'm not exactly sure how to handle this yet. Will I break some other version if I add this parameter as optional? :thinking:
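A minimal sketch of the "optional parameter" idea raised above, for illustration only. `SketchHandler` is a hypothetical stand-in, not the real pyarrowfs-adlgen2 handler, and it returns an in-memory buffer instead of talking to Azure. It assumes that older pyarrow calls the method with only a path while newer pyarrow also passes a metadata argument, as the traceback suggests.

```python
import io


class SketchHandler:
    """Hypothetical stand-in for a filesystem handler (not the real class)."""

    def __init__(self):
        # Record every call so the behaviour can be inspected afterwards.
        self.calls = []

    def open_output_stream(self, path, metadata=None):
        # Assumption: older pyarrow passes only `path`, newer pyarrow passes
        # `path` and `metadata`. Defaulting `metadata` to None accepts both.
        self.calls.append((path, metadata))
        return io.BytesIO()


handler = SketchHandler()
handler.open_output_stream("container/file.parquet")
handler.open_output_stream("container/file.parquet", {"Content-Type": "text/plain"})
```

Because the extra argument has a default, a caller that passes two positional arguments no longer triggers the `TypeError` from the traceback, and a caller that passes only the path keeps working.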

kaaveland commented 3 years ago

That API change is actually nice though; it'll make it much easier to set e.g. Content-Type than before: https://github.com/apache/arrow/commit/81039729bd0b575e5abc2fca4b61f1c909b0e786#diff-31f7eed91f9f94f6bffdc1c1294db7c32cebd1ecb88786c68c11456c4bdb4e9c

But I guess I might have to do some check of pyarrow version in order to support it.
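One way such a version check could look, as a sketch rather than what the library actually does. The helper name `pyarrow_passes_metadata` is hypothetical, and the threshold of major version 5 is an assumption taken from the title of this issue. In real use the version string would presumably come from `pyarrow.__version__`; here it is passed in as plain text so the snippet runs without pyarrow installed.

```python
import re


def pyarrow_passes_metadata(version: str, first_major: int = 5) -> bool:
    """Return True if the given pyarrow version is assumed to pass metadata.

    `first_major` is an assumption based on the issue title (pyarrow>=5.0.0).
    """
    # Take the leading integer as the major version, so strings such as
    # "5.0.0" or "10.0.0.dev1" are handled without extra dependencies.
    match = re.match(r"(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return int(match.group(1)) >= first_major


print(pyarrow_passes_metadata("4.0.1"))
print(pyarrow_passes_metadata("5.0.0"))
```

Comparing the parsed integer rather than the raw string avoids the classic mistake where "10.0.0" sorts before "5.0.0" lexicographically.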

kaaveland commented 3 years ago

Sat down to handle this now, and found out the implementation must rely on this class: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/storage/azure-storage-file-datalake/azure/storage/filedatalake/_models.py#L243

That's a generated class, in _models, which isn't great in terms of documenting how to use this new metadata option.

kaaveland commented 3 years ago

@sisaberi -- I've released 0.2.2 now, which should fix this issue unless I'm terrible at testing software. :-)