aiddata / geo-datasets

Scripts for preparing datasets in GeoQuery
http://geoquery.org
MIT License
20 stars 11 forks source link

Add tmp_to_dst_file context manager to Dataset class #160

Closed jacobwhall closed 1 year ago

jacobwhall commented 1 year ago

tmp_to_dst_file() takes final_dst argument, yields a temporary file path that you can write to. When the context manager exits, it moves the temporary file to final_dst

Resolves #148

jacobwhall commented 1 year ago

I've tested the basic functionality of this PR with cru_ts. Here is an example of it being used.

sgoodm commented 1 year ago

Slight issue was encountered when using an NFS mounted dir for the tmp directory where the .nfs file cannot be removed from the tmpdir and causes the tempdir context manager's dir cleanup to fail and raise an exception (rmtree fails on .nfs file w/ OSError: [Errno 16] Device or resource busy: '.nfs000 error). See MRC below w/ originall CM commented out and solution implemented.

99% solution seems to be just manually creating the tmpdir ourselves (rather than using a context manager). Since the tmp_to_dst_file context manager is only for a single file, which we move from the tmp_dir, there isn't really a need to delete the full tmp dir.

import os 
import time
import shutil
from tempfile import TemporaryDirectory, mkstemp, mkdtemp
from contextlib import contextmanager

@contextmanager
def tmp_to_dst_file(final_dst, tmp_dir=None):
    # with TemporaryDirectory(dir=tmp_dir) as tmp_sub_dir:
    tmp_sub_dir = mkdtemp(dir=tmp_dir)
    tmp_file, tmp_path = mkstemp(dir=tmp_sub_dir)
    # tmp_file.close()
    print(tmp_file, tmp_path)
    print(f"Created temporary file {tmp_path} with final destination {str(final_dst)}")
    yield tmp_path
    try:
        shutil.move(tmp_path, final_dst)
        # shutil.remove(tmp_path)
    except:
        print(f"Failed to transfer temporary file {tmp_path} to final destination {str(final_dst)}")
    else:
        print(f"Successfully transferred {tmp_path} to final destination {str(final_dst)}")
        # tmp_file.close()
        # shutil.rmtree(tmp_sub_dir)
        time.sleep(5)
        print(os.listdir(tmp_sub_dir))
        print(tmp_sub_dir)

final_dl_path = '/sciclone/home20/smgoodman/test.txt'
# tmp_dir = '/sciclone/home20/smgoodman/'
tmp_dir = None

with tmp_to_dst_file(final_dl_path, tmp_dir=tmp_dir) as dl_dst:
    print(f"Starting")
    try:
        print(f"Trying")    
        with open(dl_dst, 'w') as f:
            f.write('test')
    except:
        print(f"Failed")
    else:
        print(f"Success")

# when using scr20, tmpdir still has .nfs file in it until python session ends
# print(os.listdir(tmp_sub_dir))