intake / intake-xarray

Intake plugin for xarray
https://intake-xarray.readthedocs.io/
BSD 2-Clause "Simplified" License

File does not close #148

Open Pierre-Louis-Boutruche opened 1 month ago

Pierre-Louis-Boutruche commented 1 month ago

When I open a file with intake.open_netcdf('example.nc') and then call ds.close(), the underlying file is not actually closed. This causes problems later in the program, for example when I try to delete the file.

The following change to close() in intake_xarray/base.py fixes it for me:

def close(self):
    """Delete open file from memory"""
    if self._ds is not None:
        self._ds.close()  # closes the underlying file correctly
    self._ds = None
    self._schema = None

Without this modification, I get the following error when I try to delete the NetCDF file:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'data_2024-07-31.nc'
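To illustrate why the extra self._ds.close() call matters, here is a minimal, self-contained sketch. FakeDataset and SourceSketch are hypothetical stand-ins invented for this example, not intake-xarray's real classes; only the body of close() mirrors the change proposed above.

```python
class FakeDataset:
    """Hypothetical stand-in for an xarray.Dataset that records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SourceSketch:
    """Hypothetical, simplified source that caches an opened dataset in _ds."""

    def __init__(self, opener):
        self._opener = opener  # callable that returns an object with a close() method
        self._ds = None
        self._schema = None

    def read(self):
        # Open lazily and cache the dataset, as a data source might do.
        if self._ds is None:
            self._ds = self._opener()
        return self._ds

    def close(self):
        """Close the cached dataset, then drop the references."""
        if self._ds is not None:
            self._ds.close()  # release the file handle before forgetting the object
        self._ds = None
        self._schema = None


source = SourceSketch(FakeDataset)
ds = source.read()
source.close()
print(ds.closed, source._ds)
```

Setting self._ds = None on its own only drops the reference; whether the file handle is released at that point depends on garbage collection and on any other references to the dataset, which is why the explicit close() is the safer option.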

martindurant commented 1 month ago

Would you mind testing with the heavily rewritten code in https://github.com/intake/intake-xarray/pull/147?

Pierre-Louis-Boutruche commented 1 month ago

How can I test with this branch?

martindurant commented 1 month ago

pip install git+https://github.com/martindurant/intake-xarray.git@intake2

Pierre-Louis-Boutruche commented 1 month ago

I got an error when I installed this:

$ pip install git+https://github.com/martindurant/intake-xarray.git@intake2
WARNING: Ignoring invalid distribution ~atplotlib (C:\Users\pierre.louis.boutruc\Documents\git\test\.venv\Lib\site-packages)
WARNING: Ignoring invalid distribution ~atplotlib (C:\Users\pierre.louis.boutruc\Documents\git\test\.venv\Lib\site-packages)
Collecting git+https://github.com/martindurant/intake-xarray.git@intake2
  Cloning https://github.com/martindurant/intake-xarray.git (to revision intake2) to c:\users\pierre.louis.boutruc\appdata\local\temp\pip-req-build-t1jkcqeu
  Running command git clone --filter=blob:none --quiet https://github.com/martindurant/intake-xarray.git 'C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-req-build-t1jkcqeu'
  Running command git checkout -b intake2 --track origin/intake2
  branch 'intake2' set up to track 'origin/intake2'.
  Switched to a new branch 'intake2'
  Resolved https://github.com/martindurant/intake-xarray.git to commit cca0fac1a5a21415d156013831853169a5e170f3
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [31 lines of output]
      C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-req-build-t1jkcqeu\versioneer.py:430: SyntaxWarning: invalid escape sequence '\s'
        LONG_VERSION_PY['git'] = '''
      Traceback (most recent call last):
        File "C:\Users\pierre.louis.boutruc\Documents\git\test\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "C:\Users\pierre.louis.boutruc\Documents\git\test\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\pierre.louis.boutruc\Documents\git\test\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-build-env-y7kq5aqv\overlay\Lib\site-packages\setuptools\build_meta.py", line 327, in get_requires_for_build_wheel    
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-build-env-y7kq5aqv\overlay\Lib\site-packages\setuptools\build_meta.py", line 297, in _get_build_requires
          self.run_setup()
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-build-env-y7kq5aqv\overlay\Lib\site-packages\setuptools\build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-build-env-y7kq5aqv\overlay\Lib\site-packages\setuptools\build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 17, in <module>
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-req-build-t1jkcqeu\versioneer.py", line 1511, in get_version
          return get_versions()["version"]
                 ^^^^^^^^^^^^^^
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-req-build-t1jkcqeu\versioneer.py", line 1439, in get_versions
          cfg = get_config_from_root(root)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\pierre.louis.boutruc\AppData\Local\Temp\pip-req-build-t1jkcqeu\versioneer.py", line 342, in get_config_from_root
          parser = configparser.SafeConfigParser()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'configparser' has no attribute 'SafeConfigParser'. Did you mean: 'RawConfigParser'?
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

martindurant commented 1 month ago

What version of Python is this? You may need hatch installed first.

Pierre-Louis-Boutruche commented 1 month ago

My Python version is 3.12.4. I installed hatch first, but I still get the same error when I install the intake2 branch.
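For context on the AttributeError in the log above: configparser.SafeConfigParser was a long-deprecated alias of configparser.ConfigParser and was removed in Python 3.12, which matches the 3.12.4 interpreter reported here. The sketch below checks for the alias and reads a config value using ConfigParser, which exists on every Python 3 version. The read_vcs helper and the sample config text are illustrative assumptions, not the actual versioneer code or its fix.

```python
import configparser
import sys

# SafeConfigParser was removed in Python 3.12; on older versions it still
# exists as a deprecated alias.
has_legacy_alias = hasattr(configparser, "SafeConfigParser")
print(sys.version_info[:2], "SafeConfigParser available:", has_legacy_alias)


def read_vcs(cfg_text):
    """Return the VCS option from a [versioneer] section (hypothetical helper)."""
    parser = configparser.ConfigParser()  # available on all Python 3 versions
    parser.read_string(cfg_text)
    return parser.get("versioneer", "VCS")


print(read_vcs("[versioneer]\nVCS = git\n"))
```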

Pierre-Louis-Boutruche commented 1 month ago

I installed Python 3.10 and was able to install the intake2 branch correctly. However, my code now fails with:

Traceback (most recent call last):
  File "C:\Users\pierre.louis.boutruc\Documents\git\test\test_udal\beacon.py", line 145, in <module>
    results = broker._execute_argo(params)
  File "C:\Users\pierre.louis.boutruc\Documents\git\test\test_udal\beacon.py", line 130, in _execute_argo
    ds.close()
AttributeError: 'NetCDFSource' object has no attribute 'close'

If I skip ds.close() and call os.remove() directly, it fails because the file is still being used by another process.

martindurant commented 1 month ago

That's right, close() is gone altogether, and the file should only be open during the call to read(). I could re-add close() as a no-op.

martindurant commented 1 month ago

I wonder if xarray itself keeps hold of an open file? Do you get the same behaviour if you do the following?

ds = xr.open_dataset(...)

Pierre-Louis-Boutruche commented 1 month ago

With this code:

import xarray as xr
import os

ds = xr.open_dataset("data/iddas_argo_[2015-07-05]_[2020-01-01].nc")

df = ds.to_dataframe()

os.remove("data/iddas_argo_[2015-07-05]_[2020-01-01].nc")

Output:

$ python test.py 
Traceback (most recent call last):
  File "C:\Users\pierre.louis.boutruc\Documents\git\test\test_udal\test.py", line 8, in <module>
    os.remove("data/iddas_argo_[2015-07-05]_[2020-01-01].nc")
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'data/iddas_argo_[2015-07-05]_[2020-01-01].nc'

But if I add ds.close() before os.remove(), this code works.
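This is consistent with xarray keeping the file open until the dataset is closed. An equivalent pattern is to open the dataset in a with-block, which closes it on exit. This is a minimal sketch, assuming xarray and a NetCDF backend (netCDF4, h5netcdf, or scipy) are installed; the helper name load_and_delete, the temporary file, and the tiny dataset are illustrative only.

```python
import os
import tempfile

import xarray as xr


def load_and_delete(path):
    """Read a NetCDF file into a DataFrame, then delete the file (illustrative helper)."""
    # Leaving the with-block closes the dataset and releases the file handle.
    with xr.open_dataset(path) as ds:
        df = ds.to_dataframe()  # materialise the data while the file is still open
    os.remove(path)  # the handle is closed, so the file can now be deleted
    return df


# Write a small throwaway file to demonstrate the helper.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.nc")
xr.Dataset({"temperature": ("x", [1.0, 2.0, 3.0])}).to_netcdf(path)

df = load_and_delete(path)
print(len(df), os.path.exists(path))
```

Converting to a DataFrame inside the block matters: data that is still lazily backed by the file can no longer be read once the dataset has been closed.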

martindurant commented 3 weeks ago

With the intake2 version of this repo, you should be able to do

ds = cat.source.read()

and thereafter only need ds.close() as you would with xarray alone.