Open gitgud5000 opened 1 week ago
I've identified the issue within the `_save` method of `TensorFlowModelDataset`. Specifically, at line 172, it calls the `.copy()` method (PR https://github.com/kedro-org/kedro-plugins/pull/608).
According to the [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/copying.html#:~:text=copy()%20copies%20from%20a%20remote,local%20source%20to%20a%20remote%20target), the `.copy()` method is designed for copying files between two remote locations. However, in this case, since we're copying from a local fs to ABS, the correct method would be `.put()` rather than `.copy()`.
Determining whether the target is remote and switching to `.put()` accordingly should resolve the issue!
edit: something similar should be done with the `_load` method, as the appropriate method in this case would be `.get()`.
Ran some tests, and it seems it is not necessary to use `.copy()` at all: `.get()` and `.put()` will work with local-to-local copying as well.
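For illustration, here is a minimal sketch of what the change could look like. It is not the actual plugin code: `save_via_put`, `load_via_get`, `write_local` and `read_local` are hypothetical names, and `fs` is assumed to be any fsspec-style filesystem object exposing `put(lpath, rpath, recursive=...)` and `get(rpath, lpath, recursive=...)`.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable


def save_via_put(fs: Any, write_local: Callable[[Path], None], remote_path: str) -> None:
    """Write the artifact to a local temp dir, then upload it with fs.put()."""
    with tempfile.TemporaryDirectory() as tmp:
        local_path = Path(tmp) / "model"
        # Hypothetical callback standing in for the model's own save routine.
        write_local(local_path)
        # put() takes (local source, remote target); recursive=True also
        # covers artifacts that are written as a directory.
        fs.put(str(local_path), remote_path, recursive=True)


def load_via_get(fs: Any, remote_path: str, read_local: Callable[[Path], Any]) -> Any:
    """Download the artifact with fs.get(), then read it from the local copy."""
    with tempfile.TemporaryDirectory() as tmp:
        local_path = Path(tmp) / "model"
        # get() takes (remote source, local target).
        fs.get(remote_path, str(local_path), recursive=True)
        # Read before the temp dir is cleaned up.
        return read_local(local_path)
```

Since `.put()` and `.get()` also worked for local-to-local copies in my tests, the same two calls could be used without checking whether the target is remote.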
Description
I get the following error when trying to save a `TensorFlowModelDataset` to Azure Blob Storage. The issue occurs only in Azure Blob Storage, not in the local filesystem:
The error indicates that versioning is enabled despite the `versioned: False` setting in the dataset catalog. The file is never created and does not exist at any point in Azure Blob Storage.
Context
Dataset Catalog Definition
Steps to Reproduce
Edit: Included code below to reproduce. (credit to @merelcht, @astrojuanlu)
1. Define a `TensorFlowModelDataset` in the Kedro catalog with the configuration above 👆.
2. Call the `.save()` method targeting Azure Blob Storage.

I suspect this issue might be related to the following issue: kedro-plugins/issues/359, as the behavior appears to be similar to what is described there.
Environment
Full Error Traceback
Shortened for readability