frictionlessdata / frictionless-py

Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data
https://framework.frictionlessdata.io
MIT License

Failure with Python 3.12 but not Python 3.11: no attribute 'delete' #1642

Open mcarans opened 7 months ago

mcarans commented 7 months ago

I have found an issue with frictionless with Python 3.12 that doesn't occur with Python 3.11. Here is the trace:

../../../.local/share/hatch/env/virtual/hdx-python-utilities/uS5tbIcO/test.py3.12/lib/python3.12/site-packages/frictionless/formats/excel/parsers/xlsx.py:67: in read_loader
    if not target.delete:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <tempfile._TemporaryFileWrapper object at 0x7ff2d54ab350>
name = 'delete'

    def __getattr__(self, name):
        # Attribute lookups are delegated to the underlying file
        # and cached for non-numeric results
        # (i.e. methods are cached, closed and friends are not)
        file = self.__dict__['file']
>       a = getattr(file, name)
E       AttributeError: '_io.BufferedRandom' object has no attribute 'delete'

It can be reproduced with:

from frictionless import Detector, Dialect
from frictionless.formats import ExcelControl
from frictionless.resources import TableResource

url = "https://raw.githubusercontent.com/OCHA-DAP/hdx-python-utilities/main/tests/fixtures/downloader/test_xlsx_processing.xlsx?a=1"

control = ExcelControl()
control.sheet = "Trend"
control.fill_merged_cells = True

detector = Detector()
detector.field_type = "any"
detector.field_float_numbers = True
detector.field_missing_values = [""]

dialect = Dialect()
dialect.header = True
dialect.header_rows = [7, 8]
dialect.skip_blank_rows = True

resource = TableResource(
    path=url,
    format="xlsx",
    control=control,
    detector=detector,
    dialect=dialect,
)
resource.open()

mcarans commented 7 months ago

@roll Just bringing this issue to your attention as I think it will affect anyone reading Excel with frictionless in Python 3.12.

mcarans commented 4 months ago

@roll I tested with frictionless 5.17.0 and, unfortunately, it doesn't fix this issue.

roll commented 4 months ago

Thanks @mcarans!

Yeah, I know. I think they found the cause in this thread: https://github.com/Tinche/aiofiles/issues/166

cc @pdelboca
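
For context on the trace above: in Python 3.11 the wrapper object returned by tempfile.NamedTemporaryFile stores delete as its own attribute, while in Python 3.12 that attribute lives only on the wrapper's private _closer helper. A lookup such as target.delete therefore falls through __getattr__ to the underlying file and raises AttributeError. Below is a minimal sketch of a lookup that tolerates both layouts. It is not the fix adopted by frictionless or aiofiles; the helper name tempfile_delete_flag is hypothetical, and it assumes the private _closer attribute, which is a CPython implementation detail that may change.

```python
import tempfile


def tempfile_delete_flag(target, default=True):
    """Best-effort lookup of the delete flag of a NamedTemporaryFile wrapper.

    Hypothetical helper for illustration only; it relies on the private
    ``_closer`` attribute of CPython's tempfile wrapper (an assumption).
    """
    try:
        # Python <= 3.11: the wrapper stores ``delete`` on itself.
        return target.delete
    except AttributeError:
        # Python 3.12+: the flag is kept on the private closer object.
        closer = getattr(target, "_closer", None)
        # Fall back to ``default`` when neither location is available.
        return getattr(closer, "delete", default)
```

A call site like the one in the trace could then read `if not tempfile_delete_flag(target):` instead of accessing `target.delete` directly. Whether falling back to `default=True` is appropriate depends on what the caller does with the flag.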

mcarans commented 1 week ago

@roll @pdelboca Tested with 5.17.1 and it still fails.

pierrecamilleri commented 1 week ago

Hi @mcarans, I can't reproduce this (I tried validating an .xlsx file with Python 3.12.6 and frictionless 5.17.1). Could you please provide the data and the command to reproduce it? Thanks.

mcarans commented 1 week ago

@pierrecamilleri This will reproduce it:

from frictionless import Detector, Dialect
from frictionless.formats import ExcelControl
from frictionless.resources import TableResource

url = "https://raw.githubusercontent.com/OCHA-DAP/hdx-python-utilities/main/tests/fixtures/downloader/test_xlsx_processing.xlsx?a=1"

control = ExcelControl()
control.sheet = "Trend"
control.fill_merged_cells = True

detector = Detector()
detector.field_type = "any"
detector.field_float_numbers = True
detector.field_missing_values = [""]

dialect = Dialect()
dialect.header = True
dialect.header_rows = [7, 8]
dialect.skip_blank_rows = True

resource = TableResource(
    path=url,
    format="xlsx",
    control=control,
    detector=detector,
    dialect=dialect,
)
resource.open()

pierrecamilleri commented 1 week ago

Thanks :pray:, I can reproduce it.